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The Association of Computing Machinery's Administrative Di- 
rectory for 1979 lists 207 American computet science clepartroents 
granting bachelor's degrees, 127 granting master's degrees, and <73 
offering Ph.D. or-D.Sc. degrees. In addition, the jdirectory* includes 
computer science programs at all levels embedded in 163 math- 
ematics departments, 56 busines^ schools, 29 electrical engineering 
departments, and 40 other schools .or departments, including such ? 
diverse , areas as phy$ics, industrial engineering, and economics* 

v What makes these figures remarkable is the fact that the first com- 
puter science department appeared less than two decades earlier.- 
To me, this rapid grpwth is but one of several, factors that combine. 

l. to place computer science in an exceptional position vis-i-vis other 
areas of inquiry: A'brief exploration of thesfe points will be helpful 
'in pfoviding some general perspectives within which the articles in 
this study can be considered; ^ k 

First, it i$ important to note^that the burgeoning of cojnputer 
science programs cannot'be equated with tHe.maturafipiri)Tlcom^" 
puter science. There sttlfis no "standard" (ie., universally inoffen- 
sive) definition ofepmputer science* In fact, the existence of such a 

. " __ } * 

* The Encyclopedia of Computer Science (A. Ralston and . C. Meek, eds., Pet- 
rocelli/Charter, 1976) defines computer science as follows: "Computer science is 
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discipline cdhtinucs to, be a dcbatabl6 point "for a substantial*. 
nuijibcr o(< people. (A prominent educator, though himself ^ 
-chairman of a computer science department, cautions his audiences 
to, regard with suspicion any discipline with "science" in its name — 
urban Science, consumer science, economic science, social science*; 

. .: computer science.) Sortie people think of computer science'^ "uni- 
verse" as\a relatively restricted one, limited by definition to elec- 
tronic digital information processing systems. Definitions at the 
other end of* the spectrum perceive ah arena consisting of an arbi- 
trarily wide range of information processing systems, includingjbio* 
logical ones, Respite this diversity, the digital c6mputcr system is 
clearly the dominant vehicle for, study. Mildly stated, this is a very * 
unusual Situation: Instead of exploring the behavior of a cell,* a 
fluid/an organisms or a galaxy, the computer scientist seeks basic 1 
observable phenomena from an artifact (i.e., the computer itself or 1 
the program proccs^gthcrcin). Thus^ the quest fjor'"natural lajvs" 
carries little meaning\here. There is no ultimate and final reality 
against which explanatory structures are to be assessed. The 
"reality," .rcprescnitccrbj^^ software, and 

information, is arbitrarily alterable. Though it may sound almost 
facetious, the fact remains' that, if an attemp^ (regardless of its 
degree of formalism) falls fhort of explaining observed events, tfiose 
events (i.e., reality) car* be changed 'to meet the explanation Tialf-. 
' way. Inevitably, this has a profound effect on the phenomena com- 
puter scientists seek to descrit^and the ways in which such de- 
scriptions are voiced. 

A second major peculiarity lies in computer science's inherent 1 
invisibility. End products of comp\iterscience,4te., information pro- 
cessing systems, generally are used\and rabtivated) by people with 
• little* interest in computer, science, major objective in the im- 
plementation of such products' is to obscure their inner workings so 
that the user's' attention remains focused on the externally per- ' 
ceived behavior. For ^instance, a well-dqsigried translating system 

~ _ fof a high-level programming language (such aS-FORTRAN) suc*- 
, « ■ [ V ; A : 

* ' \ ■ « ■ :• ' 

concerned' with information processes, with the information structures and pro- 
cedures that enter into representations of such processes \and with their implemen- 
tation and information processing systems. It is also concerned with relationships 
- between informatkm processes^nd^classes of^asks that give Vise to them.** & ~ 
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ccssfully promotes the illusion that the uscr> prbgrams execute 
directly on a FORTRAN computer, with' no apparent Intervention 
between the program (as written) and the ^machinery. (This is cm- " 
phasized by the jargon, which terms such a system "transparent" to 

' its users.) Conversely, implcmcntors of such systems gencrinly arc 
less concerned with the ultimate uses than they are with the techni- 
cal issues pertaining to a system's design and behavior. As system 
objectives bqeame mbre ambitious, their requirements grew more 
intricate, thereby generating increasingly complex design and or- 
ganizational problems. Not surprisingly, many of these problem^ 
when abstracted, began to take on 'a separate existence, atrracting 
many people interested in pursuing them withiirtheir own context. 
The Resulting dichotomy 'produced a situation ip which a consider- 
able amount of creative and intellectually exdting work lias been 

• done with little or no connection to events in the field. Although it 
is labeled "computer science theory," it hp^ not been motivated by 
actual^ computing problems,*hor has it exerted any substantial in- 
fluence '6n computing practice. Rather, the* Breakneck pace \that 
characterizes the advance of computer applications has been fueled 
for ' the ijnost .part by a loosely interwoven fabric of empirical tech- 
nologies. . ■ „• ^ , . 

The*extent<to which formal computer science and applied com-, 
puter technology will continue on essentially independent paths 
remains a matter of speculation. There arc strong signs that certain 
areas of formal inquiry (silch as computational com^exfty, formal 
languages, and automata theory). already are beginning to have an 
impact on the design of a variety of information processing pro- 
cedures. (Thi^, in part; motivates some of the emphasis on. these 
areas, in this Study.) Moreover, the, realities of continued growth in . ; 
computer applications militate for more concerted efforts to pro- 
mote such, interactions. The conceptual demands imposed by many 
of these new systems are beginning to fall beyond the range of 
complexity that can be accommodated without more generalized 

'm odels and applicable formal structures. How ever, a n emerging 
pattern of bridge-building between abstract computer science and 
applied computing technology remains to be defined. 

Another important aspect of computer science relates to the 
computer's pervasiveness. The indoor record for platitudes may 
well' be held by variants of the statement that computers have 
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touched cvcry/aspcct of human endeavor, Overworked as this 
chcstnUt may be, it has long stopped being a hyperbole. While a 
considerable amount of this' proliferation is reasonably attributable 
to ft sustained, impressive selling effort, a significant impetus lias 
come from computer science, even as the discipline was seeking its 
♦identity. Despite the almost chaotic diversity that charactprized' 
early perceptions of computer science,- one fundamental concept 
emerged whose profound clTcct..on the breadth of computer appli- 
cations was felt almost immediately: This was nothing more (more 
precisely, we shduld say "nothing less") than the realization that the; 
computer is primarily a symbol manipulator, with numerical calcu- 
lation being but one relatively specialized activity. The primitive 
manipulations that arc specifiable oh acofnputcr receive their con- 
text from the algorithm designer. Thus we can define any system 
we choose to define, representing its components via an arbitrary 
set of symbols and imbuing that system with a set of procedural 
characteristics consistent with our definition? When these symbols 
arc processed, the "rficaning" of the operations pcrfpnficd dn them 
derives strictly from pur inception. The staggering result, then, is 
that the computer presents itself as a vehicle for controlling (man- 
aging) any procedure (hat we arc able to describe precisely. 

Self-evident as this recognition may appear, it all but escaped 
any notice by many earljf computer users. Howeyer, once substan- 
tial interest was aroused , in examining computer procedures and 
\languagcs for specifying such procedures, the revolution began irt 
earnest and.it persists unabated ^'quarter of a century later, (ionj- 
pUter usage continues to proliferate, exerting a. feedback' effect dn 
the directions of hardware and software development. • ' n 

It is this mutual interaction between computer usage and 
computer-oriented research and engineering* that tys helped en- 
gender the enormous collection of applicable technology. At the 
same time, this process has catalyzed (to a substantial degree) the 
identity problems that continue to complicate computer science's 
movement toward ad^thcwd^When computer applications are de- 
veloped in a particular .area, the resulting effecfon that field may 
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be quite profound. For example, basic directions in thermodyn- 
amics have been altered by making "exotic" computations roitfinc 
and previously "impossible" computations plausible, In a number 
of disciplines, these effects have been so fundamental that it seemed 
natural 'to associate computers and computing with those, dis- 
ciplines* Thus, cbmputcr-oriQrttcd workers in> say, numerical analy- 
sis found it reasonable to treat studies in computing as being in- 
exorably linked with those in numerical analysis. Then, whenjt 

» came to define an academic program, its natural habitat (for those 
workers) was clearly in tliat .area. This situation replicated itself to 
a sufficient extent so that as recently as 1974 the National Science 
Foundation deemed it necessary to affirm^ computer science as 

,Jbcing distinct from other disciplines. / 

.These perplexing characteristics make it all the more difficult to 
identify a group of concerns that indisputably are "cpntrqP' to com- 
puter science and therefore represent inevitable candidates for in- 
clusion in a compendium such as this one. Consequently, c^ntrality 
(by whatever criteria) was not the only motivation in the sclcctidn 
of these articles. Considered also was the desirability of empha- 
sizing topics that would be particularly interesting to mathema- 
ticians aqd woujul underscore the wide range of areas deeply and 
continually affected by computer science. ' 
: The collection resulting from these deliberations consists of nine 
articles. After a historical overview of the major factors motivating 
the development of computer sicence, there are three'articles deal- 
ing .with aspects of program* and the programming process,*Rcgar- 
dless of the objectives of a given computerized' information hand- 
ling process, the inescapable fact is that the generation of any 
observable phenomena involves 'the execution .of a progVani. 
Consequently, any study claiming to pay attention to jthe computer 

\ science's mainstream issues must examine an interlocking chain of 

v goncegts and technologies that indudes the characteristics of pro- 
gramming languages, design and implementation of algorithmic 
translators for these languages, and the analysis of programs writ- 
ten in these languages. Accordingly, am article;by William' E. Ball 
provides la systematic examination of programming languages as. 
communication vehicles for expressing algorithms. Corresponding 
language processors are then examined as means for analyzing and 
translating these programs, stated ir^ terms convenient to the user, 
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into the more primitive focms Required for execution on ^com-^ 

puter. - . > T. : '" . . • ■ - \ 

The pivotal role of programming languages ^has stimulated aV 
growing interest in the study of their features arjd properties within . 
' the broader context of fprmal language theory. Mathematical 
; models developed for the exploration of natural language's are scf v- 
ing as effective .bases for describing and characterizing computer 
iartgua^s as. well Results stemmiilg from this Work have provided 
ipvaluable mights that are : exerting ponsidjeia&bie influence on the 
direction of nevy language designs and implementations. The. scope 
■ and impact of thes^fe inquiries, together with' the underlying theo- 
retical framework, are discussed in an article by V. Book on. the 
specification of formal languages. : . - ■ V . *■ 

, ^ork on programs and programming has engendered another 
major avenue of formal study, this one dealing witlUne programs 
themselves. While there are many people who still are convinced 
' that the process of designing and implementating computable algQr 
rithms is characteristically resistant , to the imposition of any sub- 
stantial formalism, a growing number, of investigators holding the 
opposite conviction have madejmportant strides toward defining 
basic analytical tools for the formal description of a program and 
its behavioral properties. In an article on formal analysis of pro* 
•grams, T- W*. Pratt discusses, the major directions this, work >has 
taken and defines realistically the prospects for a theoretical basis 
that will make possible the systematic generation of optimally con- 
structed, probably correct programs. 

Another focal point in the study of computational processes ad- 
dresses the computations themselves. Closely intertwined with the 
properties ofe^rogramming languages and the behavior of algo- 
rithms, but still quite distinct/ is. a set of concerns regarding the 
characterization of algorithms in terms of their complexity. The 
development of theoretical structures for assessing precisely the 
"■' relative difficulty of competing solution methods, well beyond the 
^contemplative stage, is examined in an aritcle by F: P. Preparata. 
The last* four articles deal with areas that are of great interest 
nd serious concern to computer science but wnose position rel- 
tive to the "cpre" of the discipline continues to be debated. For 
e, areas such as artificial intelligence, numerical analysis, statis- 
and simulation are important applications of computer/sci- 
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ence Others take a diametrically opposite view* perceiving these 
fields ,as : Separate disciplines in, ^hich computer &ieAce plays a 
Significant but supportive tole (as electron -microscopy, might .be : 
perceived in relationno genetics or metallurgy). Still others may . 
think of spine or all oY these areas as' 1 branches of comjiuter science 
itself. Irrespective of one's orientation, there is an undeniably per- 
vasive 'attribute shaped by such fields: All of them, have been 
alteredj profoundly by their interaction with computing and com- 
puter science! " . V 

Perhaps the most conspicuous example of a branch of knQwl- 
edge in which computers have provided primary impetus is that of 
artificial intelligence. Machine-oriented work has produced drama- 
tic results in pattern recognition, theorem proving, game playing, 
and 5 iother cognitive processes that have affected the structure* and 
capabilities of information processing systems in several important 
ways. More fundamentally, achievements : in artificial intelligence 
ha\/e prompted^ some unsettling questions 'about the nature and 
constancy of the boundaries between human and machine intelli- 
gence. Accordingly, this area is addressed in a comprehensive arti- 
cle by JamesR. Slagle. 

|Both numerical analysis and statistics have been: revolutionize^ 
by the sheer mass of computational power that can be marshaled 
to attack problems in these areas.* Methodologies which in thc d 
past could only be described, are now part "of the; standard 
repertory; classes of algorithms whose potential computational' re- 
quirements; had placed them beyond contemplation are now ac- 
tively pursued, designed, implemented, and used. One) basic effect of 
Jfeis available computational power has produced a most pro voca- 
; tive consequence in both these areas: Traditionally, many of the 
assumptions underlying a wide range of statistical and numerical 
algorithms were accepted ... implicitly because ; of the extensive 
amount of computation required to test them and the even more 
formidable work required to introduce corrections (or alternative 
fnethodologies) should the tests reveal a violation. 



V Numerical analysis, in particular, has emerged with renewed protawience and 
continues to experience vigorous growth. In response, a separate volurheih this 
series is being planned to characterize ihis field. ■> ■ 
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In an article on the inipact of computers on numerical analysis, 
E. R. Bulgy and R. H. Pennington explore these effects as related. ft> 
a number of majot areas, including function evaluation, numerical 
quadrature, and systems of equations. Rarticular emphasis is placafi 
on ways of exploiting more sophisticated numerical algorithmic 
whose jise for improving approximations and (educing error-build- 
up no Ibnger can be eschewed based on computational difficulty. 

Digital simulation exemplifies an area whose basic spectrum has 
been widened dramatically by the introduction of digital com- 
puters. Mark Franklin discusses the range of analytically inacces- 
sible systems whose exploration has been made possible by the 
development of models based on continuous and discrete simula- 
tion. In addition, the article presents a discussion of pertinent mod- 
eling and validation processes, along with a discussion of ajgo- 
y rithms designed to specify these models and special languages for 
implementing'such algorithms. 

In his treatment of computational tools for statistical data analy- 
sis, C. F. ^Starmer examines a basically analogous set of effedts 
resulting irom simplifying assumptions thaf.tend (artificially) to. 
coalesce inherently different entities into groups defined as being 
\"the same" for purposes of computational convenience. At a more 
intrinsic level, there are numerous applications in which the combi- 
nation of computationally 4 effipjent algorithms and economical 
automatic data collection techniques bring into serious question 
the continued use of estimated parameters (forced on the statis- 
tician because, of sampling) instead of the exactness provided by 
gathering data from an entire 'population. 

In a study such as thifrone, it is futile to "hope for cqmpletenessf 
Accordingly, there are inevitable gaps, several of them quite con- 
spicuous. For example, the study T of functional organization of 
computing systems known as computer architecture, is an impor- 
tant computer science subject. Moreover, its prominence is increas- 
ing^ because "dramatic advances in equipment technology have 
broadened the range of practical- configurations so that the practi- 
cing computer scientist is faced with an almost arbitrary range of 
feasible options for a systerti configuration. These options include 
architectures in whieh a number of computers are linked together, 
to allow the subdivision of an application into several simul- 
taneously executing components/However, opportunities to exploit 
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this parallel processing are Severely restricted "because there is only 
a rudimentary understanding o'f the underlying phe'noirfena. Conse- 
quently; there are growing areas of inquiry aimeii at finding formal 
vehicles for characterizing such pro<^sses 3 Janguages^ 
them, and algorithms for. generating tliemfV^ equivalent sequen-v 
tial processes. v v 

These and other aspects of computer science.may be the subjects 
of additional studies in the future. Meanwhile, it is hoped that this 
study provides an> interesting look : at an explosive field still in the 
process of becoming. - . - . 

Seymour V. Pollack 
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Anytime one undertakes to chronicle the development of any 
human endeavor, it is tempting to climb' on Time's ascending bal- 
loon so that the receding past appears more orderly. The license 
granted by this "perspective" offers the convenience of reinterpret- 
ing certain occurrences, or perhaps laying others aside as .the teem- 
ing jumble fades into a systematic patchwork of related but wpll- 
bounded entities. Events, one can~clain>, foreshadow events, and we 
develop a case study in inevitability. v ^ 

When it comes- to computing and. computer 'science*- in their 
present state, it is easy to stand there, eyes ahead* and proclaim 
successful resistance: The temptation to make things orderly simply 
is not there. Computer science, after all, deals with our desire to ; 
understand the transformations occurring in an information pro- 
cessing system (more precisely, ao> electronic digital computing 
system). And this amazing contrivance, though now ubiquitous, has 
been with us barely a third of a century. (The "amazing" has not 
yet worn off.) Moreover, its arrival was riot accompanied immedi- 
ately by universal recognition that we have, here a device whose 
study warrants (much less demands) a full-blown "science;" com- 
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pjete with .university departments, jobrnajs, esoterica, international 
conventions, anfj ptber* legitimizing accoutrements. Consequently, ' 
we are 'looking at «. very new area of inquiry, .with, direct fore- 
runnersjimited to 'a handful~5f strikingly prophetic individuals. The 
imaginative" people produced ideas for machines that were far nibre 
tjian automatic calculators : Not only-would^ they follow* a ^equence 
of prescribed instructfbhs, but tHose^ instructions^ could be altered 
by the machine itself. -It was not until a century after one man's 
initial insights that a second wave of visionaries, abetted by a new 
technology, a greater urgency, and a more responsive bureaucracy,; 
revived tljese concepts, brought them to' fruition, and set the stage 
for Hie emergence of a distinct area of study. 

Additional background, less direct, must beguiled together from' / 
a variety? of responses to man's search for efficient means,of gener-/? 
ating data to alleviate a shortage and, paradoxically, his growing/ 
.need tQ organize and handle an overabundance* of data. While 1 not ( 
direct antecedents of computer science, these diverse sources have^, 
provided motivation for some of the equipment methddologies^li^^ 
have been crucial to its development. This is an important pbinit^ 
for it underscores the central notion 'that computer science deals, 7 
inevitably, bo,th with the machines themselves and with the vehicles^ 
for expressing,, implementing, analyzing,' and explaining processes 
meant to qperaj^n those machines. (Many people in the field 
emphasize this duality by using the name "computing science" in 
preference to "computer science";', perltstent use of the latter terrrv 

here does not imply any lesser emphasis.)* 

. • *• '-■.*... 

THE SEARCH FOR DATA 

7 ... • ,. •• : . 

It is no surprise that the widening interest iri computers and 
computing has prompted detailed inquiry into the history ancf de- 
velopment of the entire range of calculating machinery. Accord- 
ingly, these studies are turning up a kaleidoscope of dgvices and 
mechanisms to add to the already fascinating array of better- „ 



,* In some views, computer science is more aptly germed "information science" 
because its domain is seen to encompass all information processing systems, includ- 
ing biological ones. »''-."*' 
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known ma<&ines v SeveraI' ^ accounts ard cited at 



the e,nd of this article. 0f particular ftote is B. Randell's The Origin 
of Digital Comp/iters^l]^ in which a feeling of 

cjrama is 'intensified J^|rtiqluding reproductions of many * of the 
original papers and a^iJAts. •>/ > 

* vMarty of these m^|Klries were' designed in response to &n emerg- 
ing negd for reliabji^ata, spurred by the growth of tommerce and 
the ^ak;ening pf^iiropean science and technology. Napier's in- 
* ventlm ofjog^nt^s (1614); crucial ais.it was iti revolutionizing,the 
wa^_ of de^Tiii^with numerical calculations, required the pro- 
^pon^of^^rlthmic tables, before it could be exploited. Such 
^ies, labon^ay built by hand, bristled 0 with errors. Better celes- 
pbs^^^^s began to provide data which, in conjunction with 
4 |y^ theories thus engendered, promised major advances 
tijgmy and navigation. Here again, these advances would 
promises without the extensive set£ of computed tables - 
lei them. Even the.simple additions and subtractions used 
j>f lading and. other commercial documents began to place 
itable burdens on the time and "endurance* of the human 
btlciatis / trying to meet^grpwing demands" while maintaining 

^r^us^there'' was no lack of motivation for reliable, tireless calcu- 
lating machines. (The abacus, though used extensively and rou- 
tinely in'tfU^East* was known in Europe but never went beyond the 
"interesting toy" stage there.) In 1642 the 19-year pld Blaise Pascal 
built the prototype of a machine which added and subtracted, seek- 
ing to provide computational relief .for his father's customs work. 
This was the first in a long and varied process of digital calculators 
invented (and reinvented) throughout the ensuing three centuries. 
.Excellent summaries are to be found in Bowden [2], Goldstine [3], 
and Eames [4]; with more detailed chronicles being cited therein. 

By the early . nineteenth century, the use of mathematical tables ' 
had grown considerably, but the few mechanical calculators then 
Available had little impact; oq the preparation of these tables. (Nei- 
tKfer their speed nor reliability ;baused sustained excitement.) Conse- 
quently,'^ promise of accurate tables still was unfulfilled at that 
N time. One of the people angered by these deficiencies was Charles 
Babbage. In 1812 this young English mathematician (then 20) de- 
cided to seek a mechanized vehicle for computing function values. 
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He proposed to simplify the automation process by using the 
method of differences, ari^pproach that \yoiiId allow the evaluation 
of a polynomial function at systematic intervals without involving 
anything more complicated; than addition. This is illustraied belpw 
for Y=,X 2 + N X + 4i by^evaluating Y for X == 0,1, 8. This 
•-probably is'the function Babbagefirst used tp r present His ideas": » , 
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It is clear from the table that, by starting with the constant second- 
order difference (nth orddx for a polynomial of degree n), one could 
apply successful addition and come up with corresponding function 
values at successive integer values of the variable. Moreover, the 
constant nth order_differ.ejice can serve as a verifier. F6r example, a 
Y value of 175 for X = 1J would projjrfce a corresponding D2 of 4, 
indicating an error. Procedures already had been laid out whereby 
similar tables were mass-produced by squadrons of computers (the 
term by which such people were known then) who were set to work, 
each person performing a particular addition in the cycle. Ver- 
ification did hot seem to be a particularly widespread custom, as 
manifested by the numerous errors to be found in such tables. This 
is likely to have provided additional motivation for Babbage, who 
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planned a mechanism for handling up to sixth-order differences 
with an accuracy of 32 decimal places. A responsive goVenjment^ 
provided Babbage with support for the.prpject in 1824. Work pro- 
ceeded in fits and starts^until support ultimately was withdrawn in ~ 
1842> Despite ill the setbacks, .a'wQrkirig difference engine was 
built by^abba^e's son, demonstrated, and used fcrr m&ny years by . 
the British gjpternment and by their insurance industry. M^lreover, * 
it provide^ the inspiration for a more^odest different engine - 
(capable of producing fourth-order differences to thirteen places) 
produced by Scheutz in Sweden in 1854 and used productively for 
many years. The method of differences, and "engines" based pn it, 
remained useful for table preparation well into the twentieth cen- 
tury. In 1928', Leslie J. Comrie, then Deputy Superintendent of the^ 
Royal Naval College's Nautical Almanac Office* used a yariety of 
mechanical calculators as difference -engines to produce tables rang- 
ing frdm'Bessel functions to nautical tables of lunar data. Thus we 
find in these difference engines, and in the quest for reliable, data, 
something approaching a continuous thread through the "progres- 
sion" from calculating to computing machines. Moreover, it served 
as the impetus for the surge of interest in digital calculations that 
eventually produced the first, automatic electronic computers. In- 
terestingly, this need for rapid, voluminous, accurate numerical 
computations also helped lay the groundwork for the early and 
persistent .misconception that computer science is In intriguing but 
relatively limited^ffshoot of numerical analysis. ' 

THE* ANALYTICAL ENGINE 

While providing valuable new insights into the ingenuity and 
diversity of early calculating machines and difference engines, re- 
newed inquiries <*also have served, to reconfirm* the uniqueness of 
Charles Babbage's position as .tfie unmistakable "father" of the 
automatic digital computer. As he pursued the development of his 
•! difference engine, 'Babbage already £egan to be -bothered by its 
inherent dedication to a very specific task, i.e., computation of a 
table of values for a polynomial function. With the engine still 
incomplete (1833), he envisioned its replacement by a more general v 
"analytical engine" which, by following appropriate instructions, 
would produce values for any function. To an amazingly accurate 
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extent, the .fundamental concepts governing the organization and 
behavior of today's -digital systems are emb&died in Babbage's 
plans for his analytical engine: ftis design specified a machine tfiat 
would follow a sequence of instruction^ submitted m to it. Those 
instructions would activate computational components, accept 
input data/ and produce* human compatible (i.e.,^printed) output. 
More significantly, results produced at some 'interim point in the 
process would dictate the nature of the machine's subsequent ac- 
tivity by selecting a!n alternative pathway of instructions for the 
machine to follow. . , - . 

Of course the idea of self-regulatingjievices did not originat^ 
with Babbage By the time he wfis thinkirfg through some of the 
ideas for his. analytical engine, James Watt's steam engine, reg- 
ulated by a ball goyernor, v already was an established device and 
the principle of the thermostat had been worked out. Scientific and ; 
technological histories of the nineteenth century are abundantly , 
dotted with instances of such devices in which a poftion of the 
output is fed back to tfie input side, thereby regulating the system's 
behavior. In some instances, this "self-guiding" property * has 
prompted the efforts to represent these mechanisms as forerunners 
of computers. However, tempting as this analogy might be, it is a ( 
wide and serendipitous jump from these single-purpose mecha- 
nisms to Babbage's dazzling idea of a general'vehicle in which the^ 
operation (as perceived by the users) changes with each new set oH 
^automatically sequenced instructions. 

The great frustration of Babbage's life was that he never saw his • 
analytical engine design move from the drawings to reality. The 
" device itself was to be a very ambitious one: Its storage unit, to.be 
'implemented by a means of pegged cylinders, would accommodate 
up to 1,000 50-digit decimal numbers. Motivated by instructions 
supplied on punched cards (more about these later); selected num- 
bers would be brought automatically to the central computational 
mechanism (which Babbage called the mill) Where arbitrarily, speci- 
fied combinations of arithmetic operations would be performed. 
Results, transferred from the mill to storage, eventually would find 
their way to a printing device which, again automatically, would 
proHuce human readable output directly or prepare a plate from„ 
which copies could be made. All of this activity was to be purely 
mechanical, driven by steam power. The accompanying tolerance 
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requirements were so stringent that they fell beyond the manufac- 
turing eapabUities then current. It was not Until decades later that v 
his son, General Henry . Babbage, was able o to demonstrate the 
s soundness of his -fathers theories by successfully assembling and - « 
operating a subset of the mill, equipped with a printing mechanism. 

There is no (discernible) continuous thread fromJBabbage to the 
early pioneers in electronic digital computing. In- fact, Babbage 
himself left relatively little in the way of documentation. Recbgni- . 
tion of his colossal intellectual achievement in this regard stems 
primarily from an annotated translation by his associate CQuntess 
Lovelace (Lord Byron's daughter) of a detailed description written 
by an Italian engineer named Menebrea.* These two individuals 
seem to comprise the population of Babbage's contemporaries who 
understood the impact of his invention. Several subsequent designs 
for analytical engines, independently conceived, have surfaced in 
response to renewed historical interest and are described in [1] and' 
[2], 

THE PROBLEM OF ABUNDANT INFORMXTION 

While the scientific and commercial communities of the nine- 
teenth century industrial world began to deal with' their growing 
need for reliable data, another facet of this growth produced a 
somewhat contrasting problem — the prospect of a deluge of infor- 
mation that could not be digested for meaningful use. For instance, 
the required decennial census of the United States was beginning to 
produce so much data that there was barely enough time to pre- f 
pare and disseminate basic summaries before the next census. By 
the 1880's the problem became sufficiently acute to prompt the 
Bureau of the Census to seek ways of mechanizing the tabulation 
for the 'forthcoming 1890 census. (Some mechanlfcal aids were intro- 
duced, to a limited extent, as far back as the 1870 census, but only 
rudimentary assistance was provided, a far cry frQm the automated 
tabulation being^sought:) At the urging of Drr John Sr Billings, 
head of planning for the 1880 census, a young bureau employee 
named Herman Hollerith de\eloped and patented an electric tabu- 

' ~ * — " J 

* The paper is included as Appendix 1 in [2]. 
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Elating system in' which data Were represented as a series *)f fioles Qh 
a continuous length of perfo^d^paper, or on individual punched 
* c^ifrds. Both Hollerith and Batfbage may have drawn their inspira- 
"ij; tibh. for punched card usago^from aVommon source — the Jacquard 
loom of 1804, in which punched papfer cards controlled the selec- 
tion of strands used to weave patterns. After successful applications 1 
k *o regional data in Maryland, New Jersey, and New York (1887— 
• : - 1889), Hollerith's machinery forspunching and counting data was 
used successfully for the 1890 census, during which well over 50 
million cards were punched. 

This success was repeated in a^number of Eyropean census oper- B 
ations, including an Austrian census in 1890 ajid a.Russian census 
in 1897. Improved keypunching equipment, among other Inno- 
vations, heightened the<success of the 1900 U. S. census, after Which 
* the Bureau of the Census began producing its own 1 equipment 
-' under the leadership of James Powers. Hollerith's original com- 
pany tihe Tabulating Machine Company) becajne part of the 
Computing-Tabuiating-RecoTding Company, the direct forerunner 
° of the International Business Machines Corporation. Powers eventr , 
ually Jeft thp census bureau to form his own company. This en- 
terprise merged with Remington Rand, which ultimately merged 
with Sperry Gyroscope. Thus, the patents granted to these two men 
- provided' the basis for competing punched card systems,- with 
Remington /Rand (by then the Univac Division of Sperry ^Rand) 
, vgiving up thV^host on its card design in the late 1960's. 

The success of punched card equipment , in handling masses of 
census data was recognized quickly and exploited in a variety of 
commercial enterprises with afcundiant data problems (e.g., insur- 
ance companies, railroad companies, public Anilities, large retailers). 
By 1910 the possibilities of Hollerith's machines (which by then 
were able to add as well as sort and tabulate) already had prompt- 
ed the systematization of many financial and managerial pro- 
cedures. Jn addition to automating what has been done manually 
—by groups of clerks r this equipment engendered applications^which- 
had no equivalent manual predecessors. Thus were born such pro- 
cesses as cost analysis arid sales analysis. More importantly, there 
emerged a recognition that these information gathering and pro- 
cessing devices were agents of change— catalysts^pr reexamining 
and redefining ways of doing things. 
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^CONVERGENCE OF TECHNOLOGIES 



As usage of tabulating equipment spread, calculating machinery 
experienced a parallel growth. By the time the twentieth' century 
entered its second decade, calculators such as Steiger's "million- 
aire" and Odhner's Brunsviga numbered in the.thousands and were 
to.be fdutfd in routine usevin widely varying business and scientific ■ 
contexts. At the same time, machines like Felt's. Comptometer and. 
■Burroughs' Adding and Listing Machine were making appearances 
oujside of their customary accounting environs. However, only a 
small amount of computational ability crept, into the design of, 
i Hollerith's, and ^Powers' tabulating equipment. What now appears 
(with 20-20 a hindsight) to be a "natural" marriage between machines 
able to generate sizable amounts of data and those capable of 
organizing and handling such data was not so apparent in 1910. 
Thus we find Karl Pearson, who was a gigantic factor in the spread 
of statistical applications, preparing his massive tables with a 
simple Brunsviga calculator. . *^ 

It was not until the 1920's that L r J. Comrie, an official of tne 
British Nautical Office (who had learned about calculator^ 
Pearson), began his systematic exploitation of tabulating^ 
ment for scientific purposes. Departing from "normal' 1 
such equipment,-*fcomrie applied Babbage's different engi| 
niques oi^severaLcommercial machines, among them Holfl 
tabulators. The results, which included a set of greatly impr? 
nautical tables, underscored Comrie's important insight: By seek- 
ing approaches designed specifically to exploit the properties of 
these machines, he emphasized a quest for newly achievable appli- 
cations in contrast to those representing a direct carry-over d 
manual procedures. 

A similar synthesis occurred at Columbia University under the 
guidance of Benjamin D. Wood, an educational psychologist Sens- 
ing the importance of statistical analysis in educational appli- 
cations,, he persuaded IBM's Thomas J. Watson (in 1928) tp lend 
the university a number of tabulating machines, thereby providing' 
the- basis for Columbia's Statistical Bureau, the first of its kind. . 
devoted to education. Working closely with IBM, Wood was in- 
strumental in helping revolutionize educational testing by making 
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it economically feasible on a large scale and greatly expanding 
pertinent methodology. 

In 1931 Columbia's statistical laboratoff attracted the attention 
of Wallace J Eckert, an astronomer, who began using the compu- 
tational facilities for his work. By 1933 this usage had developed to 
a sufficient extent to prompt the establishment of a separate com- 
putational laboratory for astronomy. This facility, later to become 
the Astronomical Computing Bureau, was equipped with IBM's 
lktest tabulating and accounting equipment, which Eckert tied to- 
gether by means of -a "mechanical programmer" that Allowed the 
execution of a succession of steps, automatically. Later^ on, the 

^Astronomical/Computing Bureau was to play a significant sup- 
porting role in several major projects during World War II, include 
irig the Manhattan Project. Eckert, meanwhile, left Columbia to 
assume the directorship of the Nautical Almanac Office, where, he 
continued to apply his techniques for adapting punched card ma- 
cliinery to scientific computing. The numerous tables thus prodr 
uced were soon to be applied to a variety of wartime uses.y \ 
While it is impossible to pinpoint all the specific dates or events, : 
it is clear that the combined use of data processing equipment and 
ele.etromechanical calculators in a single context marks the be- 
ginning of a steadily accelerating movement" tdwalrd electronic 
digital computers. In^the-specific case of Columbia University's 
Agronomical Computing Bureau, this facility, served^as a catalyst 
that^lleiped move IBM decisively in the direction of scientific com- 
puting. With continued corporate support, (the laboratory event- 

^tf£lly (1937) became the Thomas^J. Watson Astronomical Com- 
puting Bureau. While its primary interest was focused on astron- 
omy and astronomical applications, the bureau served as a cradle 
for . ideas in scientific computing which were to have substantive 
effects on other -computing projects. Some appreciation of the 
bureau's central role can be gained from Herman H. Goldstine's 
excellent account of those, days [3]. One quickly builds a percep- 
tion of a rich and turbulent atmosphere in which people having 
sopie contact with the bureau keep showing up in various other 
seminal computing projects. 

The bureau's. original Board of Managers included T. Hr Brown 
of Harvard. Consequently, when Howard Aiken, then a Harvard 
graduate student, expressed a strong active interest tin digital com- 
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puting, Brown sent him to spend some time with . Eckert at the 
Watson Computing Bureau. The result was a collaborative project 
between IBM and Harvard begun in 1939 and culminating in 1944 
with an electromechanical digital computer known as the IBM" 
Automatic Sequence Controlled Calculator, and eventually termed 
the Mark I. The computer was capable of storing 72 sighed 23-digit 
decimal numbers as well as' 60 manually set constant values. Its 
machinery enabled it to perform a multiplication in about six sec- 
onds? More significantly, a string of instructions, supplied to Mark, 
I on perforated paper tape, permitted the execution of an arbitrary 
sequence of operations without intervention. L. J. Comrie hailed 
the Computer as a realization of BabbagQ's dream. 

While the Harvard-IBM project was taking shape, an indepen- 
dent undertaking with basically similar intent was going on at Bell 
■ Telepiionfe Laboratories Under the leadership of George R. Stibitz. 
Using telephone relays for storage,. Stibitz's group developed a ma- v . 
chine fgr performing arithmetic on complex numbers (1940). Inr 
struct^Shs and data were introduced via teletype, either through 
direct connection or by long distance telephone. Stibitz astutely 
recognized the advantages of using binary arithnfetic and designed 
the Complex Number Computer (as it was called) to use a binary 
coded decimal system very similar to that still employed. The ma- 
chine, used routinely till 1949, served as a basis for more ambitious 
digital computer' projects at Bell Laboratories, establishing that 
organization as an early and continuing .contributor to the new 
computer technology. ~* 

Under different circumstances, these, electromechanical com- 
puters would have generated considerable excitement. &fter all, 
here were untiring workers tha,t could compute reliably at ratcg 
inany times faste/ than possible heretofore. However, the timing 
was unfortunate in that these devices appeared jUst as electronic 
technology was beginning to mature. As a result, much of the 
impact was neutralized and the electromechanical computer's 
major role became Jthat of predecessor. • ■ 

A number^f other individuals and organizations recognized the 
great potential inherent in combining data processing arid compu- 
tational technologies and^made the appropriate intellectual leap.. 
There is no intent here to obscure their work and deny them due 
credit. The purpose, rather, is to emphasize the importance of this 



technological marriage in providing an essential stepping stone 
into the computer age. 

BALLISTICS AND BALLISTICS RESEARCH 

In outlining the convergence of major forces to produce elec- 
tronic digital computers, we must go back in time to pick up 
another important soiree of motivation. Unfortunately, Jthis source 
stems from our seemingly unrelenting desire to hurl lejthal projec- 
tiles at one another. In pursuing this somewhat bizarre approach to 
population control, there always has been a need to develop me- 
thods s for calculating Conveniently and accurately the landing 
points of various bodies\dropped on, flung, fired, Or launched at an 
enemy. As weapons became more sophisticated, such information 
began to take the form of! elaborate tables with thousands of trajec- 
tories to account for a correspondingly bewildering array of projec- 
tiles; launching conditions, and external factors. Accordingly, a sub- 
stantial segment of scientific commuting, effort throughout the cen- 
turies has been devoted to the development of theoretical njodels 
ami practical, methodologies that would permit such tables *to be 
produced with reasonable effort. Thus it is not surprising to find 
prominent mathematicians throughout history associated with ar- 
tillery boards and other ordnance agencies. 

World War I saw a concerted attempt to place the American 
ballistics effort on a sound scientific basis. This included the estab- 
lishment of proving grounds staffed with highly competent individ- 
uals drawn together from a variety of disciplines, who, through a 
combination of incisive theoretical work and well-planned experi- 
ments, produced notable improvements in the accuracy of ballistic 
calculations. Instrumental in ofte major program was Forest Ray 
Moulton, who modified the finite difference methods used in 
astronomy and applied tfiem successfully to ballistics compu- 
tations. Of perhaps equajv^ignificance is the fact that he persuaded 
the armed forces to sponsor a program wherein talented officers 
could pursue graduate work in mathematics and physics with spe- 
cialties in ballistics. This helped reinforce the idea of an ongoing 
ballistics research effort. \ . 

Concurrently, a secpnd ballistics program was set up under the 
leadership of Oswald Veblen, who Ahelped place the newly es- 
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tablished Institute for Advanced Study at Princeton University. 
Moreover, he formed the association bfitoeenato^bedy-and-the- 
American ballistics research. effort, thereby setting the stage for an 
extremely fruitful collaboration. Specifically, it was Veblen who 
brought John Von Neunjann to the institute and later placed him 
(as well as many others) in contact with the problems surrounding 
the ballistics work at the Aberdeen Proving -Grounds, This, of 
course, was of profound importance in the subsequent development 
of computers and computer science. 

The definition' and verification of improved ballistic theories 
prompted ah increased desire for machinery on which the new 
solutions could be implemented. Since much of this work involved 

i the solution of differential equations, the researchers' attention was 
caught by Vannevar Bush's differential analyzer, an elaborate 
analog device designed to solve complex se,ts of differential equa- 
tions encountered in the analysis of electrical network flow. While 
similar devices had been built to analyze specific problems (in 

0 effect, by serving as scale models of the particular systems being 
investigated), Bush's machine, built at the Massachusetts Institute 
of Technology, was much more general, being constructed to 
handle a wide variety of problems expressible in terms of differen- 

). tial equations. 1 
The analyzer's effectiveness also caugh^ 
of Pennsylvania's Moore School of Ele^tricalEngineering. As a 
result, arrangements were made for Bush and his colleagues to 
build an analyzer for each of these two organizations. Installation 
occurred in 1935.' Thus the sphere of associations between the 
American ballistics research efforts and university scientists con- 
tinued to expand. 

Although the Bush analyzer served as a very useful vehicle in 
solving ballistics differential equations, its effectiveness was seen to 
lessen as problems grew more and more demanding. This became 
especially noticeable with increases in required computational pre- 
cision. Since an analog instrument is an embodiment of a model 
basgd on continuous matti&natics, any' increase in the model's pre- 

r cision requires a cbrresponding increase iifetfie instrument's precis- 
ion as well., For example, if some continuous variable in the model 
is represented by a voltage in the instrument, another significant 
figure in the variable would mean a tenfold improvement in the 
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_voltage!s^ccuraq^he -limited-ability-of-an-analog-system-toac^ 
commodate these new requirements prompted a shift in attention 
back to digital solutions and digital technologies. Thus, early 
controversy as to whether the analog or digital approach would 
predominate in machine-oriented scientific computing gradually 
gave way to a realization that the arbitrary level of precision made 
possible by using discrete symbols to represent physical values 
would force the dominance of. digital approaches by "natural selec- 
tion." Hence digital computations became the focus around which 
computing and computer science eventually developed. There no 
longer is an "analog versus digital controversy." The former ap- 
proach, now implemented with contemporary electronic tech- 
nology, still finds important use where its particular strengths can 
be fruitfully exploited. 

Aberdeen's quest for effective digital computing machinery led i{ 
to ask Bell Laboratories for an expanded and improved version of 
its Complex Number Computer The result was the Ballistics Comr 
puter. Installed in -1944, it had' roughly three times the capacity of 
its predecessor and could " operate automatically for extended 
periods (up to 24 hours)'on instructions submitted via paper tape. 
This was superseded by a much larger computer (the Model V) 
designed for more general use. In addition to supporting the work 
af Aberdeen (where it was installed in 1947), this marine was used 
for a variety of ballistics applications both for the Navy; and Air 
Force. Bell continued to develop arid improve relay computers, 
predominantly for* its own internal use, until they were superannu- 
ated by the onrush of electronics. \ 

IBM's interest ^scientific computing, nurtured Tn part through 
its collaboration with Harvard and Columbia universities, played a 
part in American ballistics work as well. Thus, by 1944, numerous 
IBM standard arid specially designed digital devices wdre to be 
found in several ordnance installations including Aberdeen, where 
they were used in-computing bombing and firing tables. 

THE EMERGENCE OF THE ELECTRONIC DIGITAL COMPUTER 

The onset of World War II intensified greatly the need for ballis- 
tics tables. Briefly stated, the gunner (normally) knows the location 
of his target, the' characteristics of his gun and projectile, hnd cerr 
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t^iTweather conditions firing table takes these into account and, 
for a given set of conditions, specifies- angles of deflection and 
elevation/Typically, each entry in a firing table describes a particu- 
lar trajectory, and a firing table for a particular piece of artillery 
may contain on the order of 3,000 such trajectories. At that time, 
the Bush analyzer (which still could meet precision requirements 
for most cases) and Stibitz's;relay**computer operated at about the 
same speed : On either system it took somewhere between 10 to 20 
minutes to perform the several hundred multiplications required to 
produce a single one-minute trajectory. At that rate, each system 
/was capable of turning out a firing table in about a month. Since a 
: steady supply of these tables had to reach the men at the fronts, the 
Ballistics Research Laboratory mobilized for the effort. The ana- 
lyzer at the University of Pennsylvania was absorbed into the over- 
all -project, and scores of people were trained, to operate the sys- 
tems. Under these circumstances, it is n9 surprise that laboratory, 
personnel were constantly on the lookout for faster machines, and 
research efforts continued to improve the body of ballistics theory 
itself. 

By mid-1942 a • number of people already had -concluded that 
many electromechanical calculating circuity could realistically be - 
supplanted by functionally equivalent electronic, components; 
Among the strongest advocates were John W. Maughly, a physicist 
at the Moore School, and J. Presper Eckert, Jr., a graduate student 
at that institution. Their strong arguments in favor of electronics as 
a practical means for increasing computational speed helped con- 
vince Aberdeen's Ballistics Research Laboratory that the time was 
right to embark on the development of an electronic, calculator 
and, in June, 1943, the Moore School received a contract to prod- 
uce such a device for the Laboratory. The machine was to be called 
Electronic Numerical Integrator And Computer (ENIAC). En- 
thusiasm for the project was not universal: Proponents of electro- 
mechanical computers maintained that the desired increase in 
speed was; obtainable via electromechanical means, with much 
greater reliability, by partitioning the work among Several units 
operating concurrently. Others felt ' that similar improvements^ 
could be realized by applying electronic technology to analogs 
equipment. Nor were ENIACs principals under any illusion: They . 
fully recognized the enormous reliability problem in producing/ 
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such a complex device (plans called for about 18,000 vacuum tubes 
alone). Yet the heed for higher computational speed overrode these 
apprehensions and the project was approVed. 

ENIAC was not completed until 1946, but it worked, typically 
reducing trajectory computation time byji factor of 30. 

The impact of ENIAC extended wejjPbeyond the computational 
improvements effected at the Ballistics Laboratory. For example, 
the need to provide ENIAC with appropriate input/output facilities 
without undergoing another major development^ effort prompted 
Mauchly to contact IBM and engage their help in adapting sQme 
of their tabulating equipment for this purpose. As a result, ENIAC 
wa^ equipped to read from punched cards and to punch new cards 
compatible in format with available IBM printing devices. Thus 
IBM, "already heavily committed to scientific computing through its 
association with the laboratories at Harvard and Columbia, 
became an early, participant in & major move toward electronic 
computing. 

• Another major impact is well worth discussing; Although the 
ENIAC was being developed for ballistics work, it was fully in- 
tended to make the machine available for other applications. In 
fact, the first full problem actually run on the new computer was 
one whose solution was needed by the Los Alamos Laboratory. 
(Part of this run was incorporated subsequently into the official 
demonstration.) In addition,^ variety pf runs were implemented, 
including some in aerodynamics, hydrodynamics, pure mathemat- 
ics, and weather prediction: This was done even while ENIAC still 
was installed at the Moore School; When the transfer to Aberdeen 
was complete and the machine was operating again (in the summer 
of 1947), general usage continued, with the machine serving £ 
widening comniunity well into the 1950's. In spite of the mainten- 
ance difficulties arid other complications, the machine generally 
performed successfully, thereby leaving a very deep impression 
among its users regarding the future of electronic digital computers 
in scientific work. 

Thus the historical importance of ENlAC is well established. 
Questions remain, howevej, regarding its status as the ^^ elec- 
tronic computer. Despite the fact that many of the principal figures 
still are alive and have been interviewed numerous times (a col- 
lection of such taped interviews has been given to the Smithsonian 
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Institution and the the Museum of Science in Kent), the picture 
remains clouded and chaotic. A good deal of the confusion was 
resolved in 1972 with the anpearance of Herman H. Goldstine's 
detailed chronicle [3]. This iya particularly important work in that 
it was written by one of the/principals, the author having served as 
the Army's chief direcMctmnical participant in the ENIAC project 
and its major successors. But ambiguities still remain regarding the 
"correct chronology and connectivity. While ENIAC was in 
progress, Konrad Zuse, having started with a home-built relay 
computer in the 1930*s, steadily improved his machinery to a point 
where the German Air Ministry in 1943 ordered several of his 
general-purpose calculators for aerodynamic work.' Zuse went on 
to establish his own successful cbmputer company. A ^major effort 
also was carried on in England, but much of the work in the early 
1940's was kept secret, dealing as it did with special purpose com- 
puters for cryptographic use. A number .of working electronic digi- 
tal devices were produced under these auspices (Randell has repor- 
ted on some of these [1]), but the more general work' appears to. 
have received its crucial impetus from the ENIAC project. Three 
major British computer projects were the National Physical Lab- 
oratory's Automatic Computing Engine (ACE), Cambridge Univer- 
sity's Electronic Delay Storage Automatic Calculator (EDSAC), 
and Manchester University's Manchester Automatic Digital Ma- 
chine (MADM). Principals associated with thQse projects visited 
the Moore School in 1945 to see the ENIAC nearing completion 
and tp exchange ideas about improvements in its design. Bowden 
[2] provides an excellent account froih the British perspective, 
r At the same time* John V. Atanasoff, a physicist and mathema- 
tician at Iowa State, acted on his conviction that electronic digital 
computation was the proper approach toward automatic com- 
puters and began building such components. By 1941 Atanasoff's 
work had attracted sufficient attention so that an interested John 
Mauchly (prior to a his association with the Moore School) visited 
Atanasoff and held extended discussions with him. Numerous other 
related projects have been identified in various searches for a com- 
plete and accurate log of computer development; however, the in- 
tensive scientific and technological activity characteristic of the' 
World War II years^ make this already complex task all theniore 
difficult. Ironically, even at this writing litigation still is pending 
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regarding which computers preceded which othSr ones and who 
invented what. Historians seeking to link these early electronic 
computer efforts to Babbage's work in somfc substantive way have 
been generally unconvincing. The^ fact that some of the- crucial 
documentation remains classified adds further plot thickener. Also, 
the Russians still have not published the authoritative history of 
computers, in which the first electronic digital machine will be 
proved to be the work of a physicist in Odessa following principles 
defined by a Ukrainian mathematician 152 years earlier. Thus such 
controversy is likely to continue. 

Beyond controversy is ENIA^'s role as a major poiht of depar- 
ture in the evolutioh of electronic digital computers. Once the con- 
tinued development of ENIAC was established as being a matter of 
engineering and not of basic feasibility, plans were initiated for a 
direct successor. Primary orientation toward ballistic^ calculations, 
together with the imposition of a tight development schedule, left 
the ENIAC design with a number of recognized shortcomings., 
Consequently, the ENIAC group began an explorati&n of the kinds 
of techniques and components that could be used to produce a 
machine that would store a lot more information using substan- 
tially fewer vacuum tubes and would be easier to convert fr(?m one 
application to another. (The ENIAC had to be rewired each time.) 
A new research and development effort was recommended, the 
resulting computer to be called EDVAC (Electronic Discrete Vari- 
able Automatic Computer). The EDVAC is of particular interest in 
our context because of its role as a focal point around which many 
basic structural concepts first were articulated [3]. 

By 1944, John Von Neumann", already a consultant to several 
government laboratories, including Aberdeen, had become deeply 
interested in the ENIAC project, having perceived the enormous 
value of electronic digital computers in a wide range of applications 
on which he was working. Consequently, it was natural for him to 
become an active participant (with Goldstine's encouragement) in 
the EDVAC project. His role in that capacity was pivotal, both in 
terms of the project itself and with regard to its impact on subse- 
quent thinking about computers: Fundamentally, Von Neumann 
began looking at computers as logical (rather than electrical) de- 
vices. Once this all-important premise \was established, he or- 
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ganized and defined the logical functions which became a basis for 
specifying and analyzing computer operations irrespective of how 
those operations are implemented electrically. These precepts were, 
expressed in the first draft of a report written by Von Neumann 
and issued^by thp Moore School in 1945 [5]. Somehow that draft 
never was revised and ultimately it became the nucleus of an ongo- 
ing dispute (still unresolvjgj in many people's minds) about the 
exact origins of its contents. Regardless of who contributed what, 
the draft still stands as perhaps the most definitive single document 
about computers. One of its most crucial aspects is the enunciation 
of the idea that, the EDVAC (and computers in general) should be 
equipped with electronically alterable storage components capa- 
cious enough to accommodate the data used by the computer in 
solving a given problem and the instructions guiding the computer 
in effecting that solution. (The ENIACs memory held twenty 
values and instructions were prewired.) Moreover, each type of 
instruction was to be expressed as a particular numerical code, with 
the computer's control mechanism equipped to recognize these 
codes and to determine when it was dealing with such codes and 
riot with data. The implication, then, was that the same memory 
could be used to hold both types of information and the instruc- 
tions could be modified as easily as the data. Unlike the ENIAC, in 
which replicate components were provided to allow certain calcu- 
lations to proceed concurrently (this was tailored especially for 
ballistics "Work), Von Neumann perceived the computer as a com- 
pletely serial logical machine in which each individual instruction 
would be accessed from the machine's memory, analyzed (i.e., de- 
coded), and executed in distinct sequence. Unless signaled to do 
otherwise, the machine's control unit automatically would execute 
these instructions in the order in which they were stored. (Inciden- 
tally, this last principle was one of the many points of contention 
between Von Neumann and others in the EDVAC project which 
contributed to the eventual breakup of the original group. Con- 
trary to Von Neumann and Goldstine's strong recommendation, 
each EDVAC instruction was designed to specify explicitly the 
address in storage containing the next instruction to be executed.) 

Significantly, the formulation articulated in Von Neumann's 
draft continues to serve as the standard model for digital com- 
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putcrs. Even now, despite drastic changes in physical circuitry and - 
overall performance, the overwhelming majority of computers are 
Von Neumann machines. ' 
*. EN1AC spawned a number of major computer projects besides 
the EDVAC. Cambridge University's EDSAC, though it drew on 
the ENIAC work and emulated many of EDVACs concepts, actu- 
ally was completed (in 1948) before EDVAC dnd is acknowledged 
to be the first stored program electronic computer, Eckert and 
Mauchly formed their own computer company in 1946 and pro- 
duced the first U. S.-made stored program machine, the BINAC 
(Binary Automatic Computer) for Northrop Aviation in 1950. 
(EDVAC itself was delivered to the Ballistics Research Laboratory 
later that year.) They went on to build the first completely general- 
purpose electronic computer, UNIVAC (Universal Automatic 
Computer), which was turned over to the U. S. Census Bureau in. 
1951. Not long thereafter, the company became part of Sperry 
Rand. 

One of the most prominent visitors to the Moore School was Jay 
W. Forrester, a co-founder of MIT's servomechanisms laboratory. 
At work on a project to build a computer for a flight simulator, 
Forrestor's concepts were influenced profoundly by his obser- 
vations in Pennsylvania and the flight simulator evolved into 
MIT's first major digital computer, the Whirlwind. Forrester went 
on to found that institution's Digital Computer Laboratory and to, 
invent ferromagnetic core storage, a medium for computers' main 
memory components that lasted without serious challenge from 
alternative technologies until the 1970's. 

ENIAC engendered another project, in its way. perhaps more 
significant than the EDVAC. Even as EDVACs concepts were 
taking shape, concerns were raised about the future of computer 
resedrch and development once the impetus of war receded. Al- 
though the Moore School came to mind as the logical site for 
subsequent efforts, commitments to ENIAC and EDVAC, coupled 
with a variety of other factors, prompted other centers to be con- 
sidered. After much deliberation, Princeton's Institute for Ad- 
vanced Study was selected to house a major computer research 
project, to be conducted under peacetime auspices. Von Neumann 
returned to the Institute (1945) as project director and he was 
joined by Goldstine in the next year. A number of others came to 
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the project from the Moore School, government, industry, and 
from the Institute itself. . 

Like its predecessors and contemporaries, the Institute's project 
was centered around the projection of a machine (the IAS com- 
puter). However, the machine was not earmarked for any single, 
.application, nor was it destined for some specific organization. In- 
stead, funded from a variety of sources, it was envisioned as a 
research instrument available to support an arbitrary range of 
scientific and mathematical inquiries, including those relating to its 
own construction and behavior. As a result, the Institute issued 
detailed reports about the machine's organization and construc- 
tion. These were widely disseminated, thereby making available an 
abundance of fundamentally useful information well in advance of 
the machine's actual completion. As a result, the IAS project cata- 
lyzed a' number of others and many of these were well under way 
prior to completion of the IAS computer itself. These included 
MANIAC at Los Alamos, ORDVAC and ILLIAC at the Uni- 
versity of Illinois, and major systems at RCA and IBM. 

Thus, by 1947, the concepts allowing a unified functional percep- 
tion of computers were well established and fairly widely recog- 
nized, so that a substantial number of projects already were in 
progress by 1950. Moreover, most of the systems around which 
these projects were oriented were intended for general use, with 
increasing emphasis on facilitating the changes from one use to 
another. Dramatic speed advantages over electromechanical prede- 
cessors, amply demonstrated in a variety of specific contexts, pro- 
vided an additional contribution to a climate which (with the 
wisdom of hindsight) was suitable for a ^technological and com- 
mercial revolution. , 

BASIC COMPUTER ARCHITECTURE 

Before attempting to trace the growth in awareness that the 
logical and procedural phenomena in a computing system warrant 
serious study, it will be useful to elaborate a little on the basic 
functional components of a computing system. This will provide a 
frame of reference against which one can mirror the concepts and 
insights developed in the other articles, toward this end we shall 
develop a discussion around a model that (basically) characterizes 
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Fia. l. Major components of a computing system. 

most of today's configurations' and is very similar to the one put 
fprth for EDV AC and the IAS computer. 

As shown schematically in Figure 1, a computing system is com- 
prised of four fundamental components: 

l. An arithmetic/logical unit (ALU) in which is embodied the 
circuitry designed to perform certain elemental operations 
(e.g., addition, Viegation, comparison for equality, internal 
movement of daia). These operational circuits are supplemen- 
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ted by special memory units (registers) used to hold the data 
pn which the ALU operates. The diversity und complexity of 
the$e operations can Vary widely among computer systems. 

2. A main storage unit (memory) subdivided into 'a prescribed 
number of equally sized elements (words),- each with its own 
unique permanent numerical address. Depending on the type 
of computer, there may be tremendous variation in memory 
size (number of words,) word size (the amount of information 
accommodated by each wor^l), speed (the time required to 
obtain and reproduce the contents of a specified location), and 
physical construction. However, the memory's functional as- 
pects transcend these* individual differences: it is a passive 
component, initiating no action of its own. Instead, it receives 
or delivers information on demand, Except for very special 
cases, this information is not characterized by its contents but 
rather by its location in storage. For example, the machine is 
not organized to print a Specified value; instead, it is designed 
to print the contents of a specified^storage element. ' ~ 

For example, the process C «- A-B (in which the contents of 
location C are replaced with a value obtained by subtracting 
the contents of location B from those of location A) would 
require a sequence such as the following: 

(a) Reproduce A's contents in tjie ALlJ's register (A's con- 
tents are unchanged). 

(b) Subtract from the valuo^ln the ALU's register an 
■ . amount equal to that found in location B (B's contents 

are unchanged). 

(c) Reproduce in location C the value frojm the ALU's 
register (C has a new value in it; the ALU's register is 
unchanged by step (c). 

3. A set of input/output mechanisms to handle the transmission 
of data between the computing system and the outside world.' 
Regardless of the number and diversity of input/output units, 
it is the function of this component to manage the flow of 
information to and from the processing system. 

4. A control unit which directs'the activities of the entire system. 
In the basic Von Neumann machine, instructions are taken 
from memory, examined, decoded, and executed one^U a time. 
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Accordingly, the control unit is equipped with a counter that ■ 
- keeps track of the main storage location whose^contents rep- 
resent the next instruction, and a register into which that 
I instruction is copied for subsequent decoding. In this context, 
the essential function of the control unit can be represented as 
V an iterative cycle of the sequence shown below. For con- 
venience, let us assume that the next Instruction to be execu- 
ted is stored in memory address n: ••/., " ' : : : $^-,/. 

(a) The instruction stored in location n is copied into the 

appropriate control unit register. 
■(b) The control unit's counter is incremented to n + 1 (the 

location of the next sequential instruction). 

(c) The control unit decodes the instruction, thereby deter- 
mining the operation to be performed and the word of 
main storage involved in thafc-operation. The repertoire 
of operMbn types, each represented by its^own unique 
numerical code, comprise the machine language for a- 
given type of computer. As an integral part of the de- 
coding process, the' control unit activates those circuits 
in the arithmetic/logical unit required to perform) the 

■r" activities corresponding to that operation. 

(d) The ALU is activated, thereby executing the specified 
operation. \\'/-.- >■ " 

j»(e) Once thg offeration has been completed, tfie control 
/ unit resumes its basic activity beginning again with step 
(a) above. 

Thus, the instructions are executed in strict sequential order. ; 
Changes Jin sequence are handted„by an instruction whose execu- 
tion causes the number(address) in the control unit's counter to be 
changed. Then, without any alteration at all in the unit's basic 
cycle, tfie sequential execution of instructions continues from, that, 
new point. This makes impossible to select an alternative sequence , 
of events dynamically, based on prevailing conditions at that in- 
stant. .-. • ■ f.-' , 

The Von^i^eumann architecture makes it possible to fulfillan- 
other basic function without compromising^ the fundamental sim- 
plicity of the control unit's xyfcle: Since a sequence of instructions 
can be used to obtaia-4he contents of any storage location, to 
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process those contend (i.e., change the value) arifl to place the new^ 
result in the location from whence the original* value came, the 
author of such a sequence could contrive to specify a location 
whose contents happen to be one of the instructions* in [he se- 
quence.. With properly specified manipulations, the resultsjcould 
produce a situation in which a particular instruction is executed 
and then, through a normal sequence of activities" that instruction 
is replaced with a modifijd operation. A change in sequencing (as 
outlined above) forces the control unit back to execute again from 
that location whose contents underwent the change.:,Thus, in jeffect, 
the .storage of instructions and data in the^same memory immedi- 
ately implies the availability of a self-modifying inechanism- whose 
flexibility is limited only by the user's perception of how to (exploit 

EDUCATIONAL BEGINNINGS / . 

, -v.- « ■ . > : ■ ■ yy .. :•(• ■ 

v- Regardless of the "true"chronolcJfey of concepts or the ambiguity 
surrounding the proper assignment of credit for "firsts" to thk right 
individuals, organizations, or countries, electronic digital com- 
puters clearly were production items in the United States by the 
mid-1950's. Well over a thousand systems were 1 in use and a 
number of companies were seriously committed to the positional 
scramble within the burgeoning industry. (IBM was just beginning 
to. assume preeminence.) This growth, anything but systematic, 
produced an increasingly acute shortage of personnel prepared to 
deal with computers and their use. Manufacturers tried to provide 
support for thdir customers by implementing their own training 
programs, but the effort fell short as growth continued to acceler- 
ate. ' - ■ \ 

A possible solution was Seen in the university computer labora- 
tories. Although \he major design and manufacturing work was 
shifting to industrutl^settings, activity in the laboratories did not 
halt. Some maihtained^h^rresearch/design programs (with several 
laboratories under manufacturers' subsidy); others redirected their 
emphasis toward developing and supporting new applications. This 
latter orientation brought into sharp focus the desirability of pro- 
viding a computing resource for use across an arbitrary spectrum 
of university research and administrative activities. Prompted by 
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the increiasing availability, of "ready-made" computers, more and 
more universities established such laboratories (primarily) as ser- 
vice facilities, inevitably including instruction in computing and , 
programming among their functions. While much of this instruc-* 
tion was informal, intended to support internal users, ^ many 
universities were quick to recognize ah intrinsic educational re- 
sponsibility to ^the rapidly growing population involved \ with' 
computers. ^ \ 

By 1956 well oyer a d6?en American universities had compu- 
tation laboratories equipped with electronic digital computers and 
organized attempts to characterize these responsibilities were well 
under way [6]. As would be expected in any new pursuit, there was 
great v diversity in people's perception of the nature and extent of 
the educational needs: For example, some saw a relatively clear 
dichotomy in which electronics engineers design, build, and mainr 
tain computers, and experts in their respective fields (i.e., astron- 
omers, mathematicians, accountants, meteorologists, and so dn)\ 
identify, design, and implement applications. Accordingly, the lab- A 
oratory's educational duty, in this context, would be to provide \ 
training in how to write programs and use the university's facility. 
Other views were predicated on the need for a new type of person 
ranging from an "analyst" schooled: in mathematics, electrical 
engineering, or business and trained in computers on the job, to a 
"computer expert" produced by a vocational institution, to a 
"mathematical engineer," the latter resulting from a graduate level 
curriculum within mathematics. Still another view held that every 
student headed for, a career in science or business should receive a 
course in computer^ (either from the mathematics department, elec- 
trical engineering, or the computer laboratory). 

Some institutions began to. provide a framework for students 
interested in computer work, with the tendency being to provide 
specialized courses in computing and programming on top of a 
prescribed group of standd^d offerings in applied mathematics. By 
1958 several universities, through &§ir numerical analysis centers, 
elecfrical engineering departments, or computer laboratories, were 
offering fellowships specifically earmarked for concentration in 
computer work, and many others routinely included one type of 
computer-related course or project in science, engineering, math- 
ematics, and business curricula. The center of activity, though, still - 
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was the computer laboratory, Which continued to become an in- 
creasingly recognized source of trained personnel for industry and- 
commerce. 

As a result, computer manufacturers stimulated the growth of 
university computing with the same zeal and determination that 
characterized their campaigns for nonacademic customers. A 
number of companies instituted liberal discount policies, thereby 
facilitating the establishment of centers throughout the country. At 
the same time, university people interested in the structure, admin- 
istration, and impact of such centers were encouraged to band 
together for cooperative study and discussion of these issues. (In 
many caSes the "encouragement" was more substantial, taking the 
form of a quid pro quo in which the implementation of computer 
courses was a precondition for installation of a'heavily discounted 
machine.) The federal 'government, also encouraged formation of 
university computer laboratories via institutional grants. 

By I960 about 200 colleges and universities were equipped with 
digital computers, and the general infusion of computer usage into 
the educational process was well established. Processes for devel- 
oping computer programs were facilitated greatly by the;, introduc- 
tion of high-level languages, which brought the coding ofrpr^gj-ams 
closer $o human tfefms; The iise of a program (compiler) to mediate 
between the high-level language and that required by the machine 
was pioneered by Grace M. Hopper. As a naval officer she played a 
significant role in Harvard's Mark I project. After World War II, 
she implemented* many of her ideas about language processors in 
her position as senior mathematician with the Eekert-Mauchly 
, ComputffFfcorporation. 

Use of computing facilities, was simplified further by a variety of 
convenient software products for creating, exploiting and maintain- 
ing program libraries. Many of these supportive programs operated 
beyond the users' perception and were, in effect, "invisible." For 
example j a numerical integratipn program coded in the FOR- 
TRAN language could be submitted in that form (along with data) 
to a computing system. Without further huipan intervention, the 
system would deliver final results (i.e., integral values):.; Unless ex- 
plicitly informed otherwise, the user easily could sustain the illusion 
that the numerical integrator and ;the computer were the sole par- 
ticipants in that process. 



28 Seymour V. Pollack 

As these procedures and services developed, realization grew that 
here was a sizable body of knowledge closejy related to, but dis- 
tinct from, the applications themselves. The problem addressed by 
these concepts and techniques (e.g., "analysis and translation of 
languages, automatic generation of programs,v creation and man- 
agement of effective user-oriented working environments on com- 
puters) were far from trivial ^nd they grew increasingly important 
with the advent of larger, more complex, and more versatile com- 
puting equipmendt would be unrealistic to assign any degree of 
unity to this realization, or to associate it with some kind of coordi- 
natefl%eaetion to it. While the fruits of this technology were indis- 
pensable components of any successful' computing facility, there* 
was'nptfiing near a consensus regarding the characteristics of this 
tecjti&logy; opinions differed widely as to whether there was a 

^discernible "body of knowledge" here, and, if so, where" its proper 

/ Kome was. Some argued thai it was part of numerical analysis; 
others saw it as a newly, emerging area subsumed, respectively, 
under' mathematics, electrical engineering, industrial engineering, 
linguistics, library science, and business (among dthers). Advocacy 
for a new and distinct discipline still was very jsparse, owing in part 

jto the difficulties in ascribing some type of specific identity to this 
field. Some thing^were fairly well settled: Notably, the complexities ; 

' inherent in the design and implementation of effective general- pur- 
pose systems components sdch as language translators and other 
software products were sufficiently well appreciated to allow the 
general abandonment of the idea of vocational settings for such 

N instruction. Nevertheless, there were. fundamental questions regard- 
ing who teaches what to whom. 

A number of institutions, feeling that detailed operatiOjiaLand 
structural knowledge of systems software was central to the prep- . 
aration of an effective user of computers, sought to include such 
instruction by expanding the offerings in the department currently 
teaching programming and computing. The results, though dis- 
appointing pedagogically, helped underscore the: fact that many 
(probably most) users had little or no interest in the science and 
technology of developing software. (Users' similar lack of en- 
thusiasm fop equipment details already was clear.) Consequently, 
alternatives were considered wherein software specialists, now to be 
distinct from application* programmers, computer designers, and 
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computer users, were to align themselves with computer labora- 
tories, under whose auspices such training could be formalized. 
: Another mod^l was based on reorganization of computer-related 
studies into components of a new administrative unit (e.g., an 
interdisciplinary program or department in which ^computer de- 
sign/engineering, numerical analysis* computer applications, etc., 
could be divisions, or even a school in which they could be depart- 
ments). These were implemented, in oi^e form or another, 
by some universities. For example, by 1961 Carnegie-Mellon Uni- 
versity (then Carnegie Institute of Technology) had an interdepart- 
mental doctoral program in computer systems and communi- 
cations; Stanford University had* a computer science division 
within the mathematics department and the University of Wiscon- 
sin set up a numerical analysis department.^iowever, most institu- 
tions remained cautious, adding individual courses as the market 
dictated. . • * . " 

Matters were complicated by individual identity problems, even 
among major practitioners. The following (hypothetical) situation 
was typical: A pharmacologist needing some machine , compu- 
tations found that, for his situation, it would be most expedient for 
him to implement the applications himself. After completing the 
computer laboratory's informal programming course, he developed 
and coded a program which, eventually, did what he wanted and 
the project was concluded successfully. In the process of testing and 
refining the program, the pharmacologist had learned much about 
the laboratory's operation. (He even may have written a utility 
program in support of his project and then generalized it for incor- 
poration into the laboratory's resources.) As a result* he found 
himself spending more and more time helping others with their 
applications and with the logistics of laboratory use. After a while 
this became his major activity, and the laboratory formalized the 
process , by adding him to its staff. Yet he continued to think of 
himself as a pharmacologist. 

Other, jnore fundamental factors also impeded crystallization of 
academic frameworks for computer-related instruction. Despite the 
^mushrooming growth- of computer usage, many people (among 
them those with heavy involvement in programming snd appli- 
cations) still held a vague and narrow perception of what a 
computer is. While there was a rapidly expanding literature on 
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equipment design, applications, and computational techniques, 
relatively, little o£ it addressed itself to more reflective aspects of this 
new area. This, coupled witti the primary motivations for early 
computer development, made it "very- easy to propagate and rein- 
force the contention that a computer is a fast, powerful, mathemat- 
ical tool, a "super slide-rule." Accordingly, it was argued, pertinent 
instruction should emphasize the design and maintenance of these 
machines, the techniques for programming them, and the selection 
of computational processes best.suited for implementation on com- 
puters. ^An indication of« the roots for this perspective can be seen 
by the ^act that it was not until the UNIVAC was developed that a 
digital /computer was equipped to store and display nonnumeric 
characters. Moreover, numerous machines that were built subse- 
quent to the UNIVAC were restricted to handling numeric data.) 
Further encouragement for this outlook came from computer 
manufacturers who actively propagated the myth that "scientific 
computing" and "business computing" were inherently different ac- 
tivities requiring different ^approaches and different types of ma- 
chines. (The former endeavor was characterized 5 by \ arbitrarily com- 
plex computations on relatively small amounts- of data; the latter 
involved trivial computations on arbitrarily large amounts of data.) 
As user sophistication grew and the spectrum of applications 
broadened, it became increasingly difficult to defend this over- 
simplification, and one rarely hears it nowadays. 

A second factor which retarded movement toward arfeduca^ 
tiorial identity was a notable dearth of "science" to go along with 
the rapidly growing technology. It was easy enough to identify a 
collection of very useful and clever techniques for improving 
various aspects of a computer system's effectiveness. In certain con- 
texts, it was possible to point to sets of general precepts which were 
emerging as foundations for areas of applied; work. For example, 
understanding of high-level programming languages and their 
translators . in the early 1960S enabled such objects to be 
designed with a considerable amount of determinism, in contrast to 
their weed-like predecessors. However, there were no major 
phenomenological frameworks that served to unify and- organize 
the tremendous amount of empirical information, that already had 
accumulated. The people who were claiming existence of a separate 
discipline called "computer science" or "information science" were 
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hard pressed to identify the characteristics or cornerstones of such 
a science. (One finals references to such puzzling concepts as the 
"theory of applications.") 

Certainly, such difficulties were understandable. /Unlike most 
other areas of inquiry, there was no natural arena such as an atom, 
tissue, or crystal lattice to serve as a source of observations. In- 
stead, the "universe" of interest was an- artifact barely a decade old; 
there was no direct heritage of contemplative structure to which 
the newly acquired observations could be reconciled. In these cir- 
cumstances it was natural for seekers of academic respectability to - 
fill the void with existing material that was strongly related -to the 
emphases; at a particular institution. Thus/ at a university with 
strong commitments to engineering compyter applications and a 
desire to formulate an educational program, the scientific aspect of 
such a program would be organized around relatively extensive 
offerings in numerical analysis. The more adventurous institutions 
introduced organized efforts to analyze and compare numerical 
algorithms with regard to their suitability for coriiputer implemen- 
tation; others taught numerical analysis unaffected by computers, 
leaving it to the old hapds in the laboratory to provide interested 
'students with the tricks of the trade. Similarly/ an institution em- 
phasizing computer design and construction would place numerical 
analysis in a less dominant role hi favor of Boolean algebra, switch- 
ing "theory, and mathematical logic. Here again, depending on the 
individual program, the material was augmented witl£ explicit con- 
nectors to computer technology, or questions of applicability were 
left for the student to discern. Regardless-' of the extent to which 
such dovetailing was attempted, the "pure" subjects generally were 
not treated as "borrowed" disciplines which would serve until the 
"real" computer science matured. Rather, there was genuine convic- 
tion that switching theory, or numerical analysis, or documentation 
theory (or whatever), was the proper nucleus around which com- 
puter science would develop and grow. As it turned put, this made 
it a little easier/to provide a still new and amorphous area with 
some semblanoe; of "tradition." A closer look at this notion will be 
' -of interest; pfe;;?^* 

As discussed earlier, one can point to a, collection of compu- 
tational/and logical devices that span over three centuries and 
connect such devices conceptually to electronic digital computers. 
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In the same general sense it is possible to identify certain contri- 
butions in mathematics, logic, and philosophy ras "spiritual 
forerunners" of computer science. While this may provide a com- 
fortable feeling of continuity,, it really does not contri&jite substan- 
tially to our current perception of computer science and its central' 
issues. That does not deny all connections; far from ,it. For in- 
stance, the importance of Boolean algebra to computer technology 
and, subsequently, to computer science is beyond dispute. But 
Boolean algebra has not become an aspectof computer science (or 
vice versa). Rather, the precepts and techniques of Boolean algebra 
have become part of the collection* of indispensable vehicles used in 
computer science in the same way that partial differential equations 
have been crucial to the study of astronomy or ballistics. As sepa- 
rate computer science programs began to form, in the early sixties, 
the perceive^ need for academic underpinnings made it especially 
easy for many to adopt these disciplines, construing them as central 
issues. Many of the ^subsequent evolutionary changes in computer , 
science education can be related to an increasing' awareness of the 
conceptual distinction between the science itself arid the expressive 
and manipulative vehicle required for its pursuit. 

Despite this jockeying for tradition, it would be misleading to 
say that computer usage, was developing with no science at $11. 
Considerable theoretical work was in progress and some of it pro- 
vided the roots for today's substantial and*relevant body of know- 
ledge. For example, the complexity of 1964's state-of-the-art com- 
puting systems necessitated the implementation of powerful execu- 
tive software structures (operating systems) whose cost began to 
rival (and soon would exceed) that of the hardware. Effective use of 
the^equipment called for a multiprogrammed organization, wherein 
a number of independent (and probably unrelated) programs would 
be in the system at a given instant, each on its way to completion 
ahd each contending for some subset of the system's resources. 
Proper management of the .resources and mediation among the 
contenders without inflicting excessive overhead called for an inte- 
grated system organization far more sophisticated than earlier ex- 
ecutive programs. In response to these demands, theories began to 
emerge for ' characterizing such systems, analyzing their per- 
formance, ancf predicting behavioral effects wrought by specified; 
changes in operating parameters. Such exploration draws heavily 
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on the techniques of operations research and statistics, but it is the 
study of operating systems, reinforced by operating systems theory, 
that is peculiar to computer science. ; 

At about the same time, there was a growing realization that the 
process of preparing, developing, and testing programs was subject 
to some degree of systematization that could raise such endeavors 
to a more consistent level. Eindhoven Technological University's 
E. W. Dijkstra began to identify coherent' principles which 
characterize sound program structure and form a basis for eventual 
formal verification of a program's correctness. Work stemming 
from these important insights has resulted in. new programming 
languages (and extensions to existing ones) designed to facilitate 
the application of these principles of structured programming. The 
effectivenejss of these* precepts has been demonstrated repeatedly iri 
terms of less costlyi more reliable software products. More signifi- 
cantly, these ideas have precipitated a fundamental change in atti- 
tudes toward program design and assessment such that the entire 
cycle of implementing algorithms, a process central to computer v 
science, has lost a good deal of its craftlike character ,and is ap- 
proached in a much more disciplined manner [7]. 

Other areas of concern to computer science began to coalesce in 
the same general way, driven by operational problems stemming 
from new, equipment technologies and more challenging appli- 
cations. Yhe resulting infbrrhation-handling processes began to 
overtax. the ad hoc mejthodoloi|fe!5 usi^d tp analyze simpler se- * 
quencesof events, and m^i^icom^ began to 

appear.-. ■ ... 

1 .a ' - . 

THEORY VERSUS PRACTICE 

At the same time another trend was taking shape. The rapid 
growth of computer science education stimulated«increased interest 
in theoretical areas (such as automata theory and formal^ 
languages) whose pursuit predated computers. Now, these areas'^ 
were seen potentially to impinge on questions raised by the design 
and use of computer system?. Consequently, there appearied to be ^ ; 
prospect of concurrent and mutually nourishing development in 
computer scieiice theory and practice. Curiously, this did not 
happen. T^e newly intensified effort generally maintained its own 
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paths, interacting very little with the application-motivated prob- 
lems that were helping to spur headlong advances in hardware and 
software technology. ^ , ; ^ , 

A brief look at the early role of automata theory provides an 
interesting example of this situation. Mathematical logicians had 
been concerned for some time with classes of computable numbers 
and exact procedures for obtaining them. In 1936 these studies 
received tremendous impetus from Alan Turing and Eimil Post. 
Working independently, each devised an abstract machine (an 
automaton) in which the outcome ( a computable number) could be 
represented as a_^equence of ones and zeros, and the e^act pro- 
cedure for producing that outcome as a sequence of well-defined 
primitive actions. Moreover, Turing was able to show that it was 
possible tp specify a "universal'* machine of this character such 
that ;it 'c6ul<J duplicate the results of any particular automaton, 
even those producing arbitrarily complicated sequences. Thus, if a 
number was pomputable, it could be a computed on a universal 
Turing machine. Both Turing and Von Neumann were acutely 
aware of the applicability of these results to the description,, 
characterization, and analysis of automatic computers. (Turing de- 
clined an offer to become Von Neumann's assistant and went to 
head England's Automatic Computing Engine project.) 

Once the idea of teaching the "science of computing" began to 
gain momentum, many people ^'t^to^^Cability to formulate 
automata as abstractioris-of con^ute^^^S^^^yide the basis for 
an . increasing flow of ideas and results betw^n tl^ory and practice. 
• Accordingly, automata theory became a mainstay *of many early , 
graduate programs in computer science. 

^•H.owever, the crossflow did not occur. Workers engaged in de- 
viling coherent structures for the growing mass of observations 
obtained from computer-related processes hoped eventually to ex- 
ploit the models and insights that would flow from automata 
theory. Meanwhile, it was important to teach the theory on its own 
for its/rich cultural .yalue (and it was respectable too). However, 
work-in automata continued within its own context, basically un- 
disturbed by newly focused attention from computer science pro- 
grams. It was not until well into the 1970's that the pathways of 
automata theory and computer science began to converge, and 
now the impact of one on the other is notable and useful. 
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This metamorphosis of computing/computer technology/ com- 
puter science was observed closely by a number of organizations 
besides the academic institutions themselves. (We already have 
mentioned the government and computer industry.) In addition, 
this process was of great interest to professional societies, promi- 
nent among whom was the Association for Computing Machinery 
(ACM). Formed in 1947, the ACM had assumed a central role in 
encouraging the evolution of computer science as a distinct field of 
study. By 1963 this concern took form via ACM's Curriculum 
Committee whose first recommended program in 1965 [8] rep- 
resents one of the earliest attempts to produce a coherent definition 
of computer science's major concerns. In a direct predecessor to 
this curriculum report, T. S. Keenan identified four such areas .£9]; 

1. Organization and interaction of equipment constituting an information 
processing system The system can include\both machinery and people, and 
its organization will be influen^d by the environment in which it is em- 
bedded. 

2. Development of software systems with which to control and communi- 
cate with equipment. . .. 

3. Derivation and study of procedures and basic theories for the specifi- 
cation of processes. . . . : *> 

: 4. Application of systems, software, procedures and theories of computer 
science to other disciplines. 

• i * 

This report was pivotal in several ways. Besides reaffirming the 
idea of computer science being p a distinct area of study, it gave 
primary emphasis to the implementation of a computer science 
curriculum as a distinct mathematical entity, with its own majors. . 
(Similar combinations of subjects were being suggested at about the 
same time by the Mathematical Association of America's Com- 
mittee on Undergraduate Programs in Mathematics (CUPM) and 
by the National Academy of Engineering's Committee , on Com- 
puter Science in Electrical Engineering (COSINE) as areas of con- 
centration for majors in mathematics and electrical engineering, 
respectively.) Moreover, while the importance of numerical appli- 
cations was emphasized (Table 1), there was a tentative but none- 
theless explicit effort to depart from the self-limiting notions that 
computers are numerical instruments and computer scientists are 



Tabu I 



Recommended Curriculum for Computer Science Majon 



\ Courses 


■ 


Computer Science' 






\ ' 
RecomX 


Basic Courses 


Theory Courses 


i i 

'/ - | i 
Numerical Algorithms 
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Supporting 


mcndalionsX 


— u — ii_ 






and Applications 

rr 




Required 


\ 

1, Introduction lo , 
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Programming 




, Analysis II . " 




Equations 
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,10, Combinatorics 
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Eleciives 
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H, Introduction 
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12, Mathematical 
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Optimization 
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Theory 


1 


Techniques 


Linguistics 
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IJ, Formal 


1 
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Philosophy and 












Philosophy of 
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Science 
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"superprogrammers" adept at implementing numerical algorithms. 
The effort was tentative because its curricular uncertainties still, 
were strongly evident: The primacy of algorithmic processes and 
languages was clearly established (both are, required). At the same 
time, there was an apparent hesitancy in giving up familiar com- 
forts, and so calculus and linear algebra were required. (The 
numerical courses are shown as computer science courses to em- 
phasize their orientation toward computer usage.) 

While the recommendations were not universally adopted, they 
exerted great influence as a catalyst, prompting accelerated activity 
in curriculum development throughout the'eountry. A rough index 
of this growth is obtained>by comparing the number of United 
States undergraduate degree programs in computer science in 19^4 
(about a dozen), with thdse of 1968 (close to one, hundred). A simi- 
lar-increase is seen at the masters level, and about a fourfold in- 
crease (from about 10 to about 40) at the doctoral Igyel. Less 
evident but still present was the additional growth that occurfed 
within parent departments. 

While the ACM's curricular recommendations had some unify- 
ing effects, growth in computer science educatibn'still continued to. 
be turbulent, pulled in many directions by institutional differences 
and diverse perspectives. Even where there was agreement that 
computer science should stand by itself, there was controversy over 
its placement, If the computer laboratory's key builders were math- 
ematicians (as was true in most cases), the emerging computer % 
science department took shape as a mathematical entity, housed ia 
the school of liberal arts. On the other hand, predominance of 
engineering usage prompted the establishment of an engineering 
department, with the curriculum's contents tending to be more 
pragmatic. A third approach sought to emphasize the interdisci- 
plinary (or pandisciplinary) impact of computing by constructing a 
curriculum in which basic courses were to be supplemented by a 
somewhat arbitrary collection of offerings dealing with computers 

in ' ■ ' 

Another, more specific aspect of this turbulence is seen by the 
fact that there was virtually no consensus on the structure of an 
introductory course. Sufficient insight already had developed so 
that/the more progressive institutions agreed on what such a course 
should not be— a cookbook course in high-level language coding— 
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and on the fact that it should place major emphasis^on the study of 
algorithms and. algorithmic processes. Beyond that; opinions di- 
verged on how these concepts should be imparted, with approaches 
Vrangjng from pencil-and-paper conceptual models through spe- 
cially designed pedagogical programming languages to the use of 
existing languages/ Morepyer, there was wide , disagreement on 
computer , science's service role. At one end. of the spectrum there 
was advocacy for a universal introductory \course (like General 
Chemistry I); at trie other end, some favored fragmentation (more* 
like. statistics) where each department would either provide its own 
introductory course or send t itsJ students to the computer science' 
offering (intended for its own majors but open to i others). 

At a more fundamental level, many universities, while convinced 
of computer science's separate identity, felt that an. independent 
program was premature. For them, computer science^wasYa gradu-' 
ate specialty to be preceded by undergraduate. concentration in 
some established area (not necessarily mathematics or science). 

The pressures and experiences generated by this explosive 
growth helped accelerate the refinement of ACM's preliminary, 
undergraduate curriculum so that a fully developed version ap- 
peared less than three years later [10]. Even in that brief interval 
some important conceptual processes matured artd, because ofthis,^ 
Currigulum *68 (as it became known) stands as an important land- 
mark- for computer scierice education, perhaps in the same sense 
that \Vbn Neumann's IAS machine seryes as a piilestone for con- 
ceptual computer design. Philosophically, Curriculum '68 marked 
'the end of debate regarding the separate existence of computer 
science. Moreover, it clearly^ placed such occupational areas as 
computer operations, coding, and data preparation outside the 
realm of computer science. Se&§$ 

The perception of computer science itself underwent impo/fant 
organizational shifts: In a very fundamental reorientation Cur- 
riculum '68 identified the representation, structure, and transform- 
ation ;of information as a major focus, conceptually dissociated 
from specific computer systems* or appljcations. 

Consistent with this outlook, hardware; and software systems^ 
perceived as 'separate areas in th^ earlier recommendations, were 
redefined in a single framework, i.a, systems capable of transform- 
ing information. This also reflected a movement . in practice toward ^ 
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the unification 7 of hardware, and software design necessitated by the 
' increased ^capability of new equipment Effective exploitation of 
such hardware now began to require integrated design of a cpm- 
puter systeim^ hardware and software instead of superimposing the 
latter on the former, 

Another shift developed with increased recognition of common 
methodological threads running through computer usage. As a 
^result, a third major computer science focus was articulated, cen- 
etered around! the, identification ahd i^islbpmeni of methoddlogies 
■ derived from, applications with common, processing characteristics, . 
irrespective of the intri^i^ relations among them. Thus, for exam- 
ple, the area of computer graphics centralizes techniques distilled 
from (and useful for) visual display contexts as diverse af medicine, 
geography, stress analysis, and textile design .V y -\ . 

ThQ conceptual division of computer science into these three 
areas was supported in Curriculum '68 by a fourth category en - 
compassing a wide collection of disciplines involved with com- 
puters. (The content^ of a given collection, of course, s would be 
dictated by conditions at the particular institution.) \ 
; : These perspectives were molded into the fully developed core 
curriculum summarized in Table 2. Gomputef science offerings are 
grouped into basic (B), intermediate (I)* and advanced (A) levels,, 
commensurate with academic background and maturity. The role 
... of the first two areas (iriformation structures and information pro- 
cessing systems) is emphasized by requiring all students to take the 
coursed add jessiifg those areas. A^more flexible , approach is taken 
to the third area (methodologies), with the student selecting those 
.courses most congruent with his interests and objectives, 
{ r As implied in Table 2, Curriculum '68 solidified an earlier con- 
tention that computer science is primarily a mathematical ehdeav- 
' v qr and its praotiiioners can .be expected to . engage in work 
requiring predominantly mathematical processes. This was instru- 
' mental in producing a clear and- consistent .pic^re df computer 
science's position in a school of arts and, sciences, eithWs a separ- 
ate department- or as 'a distinct but integral/ part of a broader 
mathematical environment. As a result; it provided a very useful 
source of inspiration, serving as a guide for . the fonnation of nu- 
merous new department^ as well as for the reorientation of existing 
z ones. . \ . \ ', ,' •'/'; *\ ' ;••>'•• 
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ACM Curriculum '68 Core Cojirses. 





Computer Science Courses 


Mathematics CoursesJ 


Basic Courses 


Bl. Introduction to Computing* 
B2. Computers and. Programming* 
B3. Introduction to Discrete 

Structures* : v 
B4. Numerical Calculus* 


Ml. Introductory 

Calculus* 
M2. Mathematical 

Analysis I* 
M3. Linear Algebra* 
M4. Mathematical 

Analysis II* 


Intermediate 
s Courses 


U. , Data Structures* 

12. Programming Languages* 

13. Computer Organization* 

14. Systems Programming* 

15. Compiler Qmstructionf 
*I6. Switching Theoryt " 

17. Sequential Machinesf 

18. Numerical Analysis If 

19. Numerical Analysis lit 


M2P. Probability* 
M5. Advanced 

Multivariable 
, Calculusf 
M6. Algebraic 

Structuresf 
M7. Probability and 
Statistics f 


Advanced 
Courses 


Al. Formal Languages and 

Syntactical Analysis 
A2. .Advanced Computer 

Organization 
A3. Analog/Hybrid Computing 
A4. System Simulation 
A5. Information Organization 

and Retrieval 
A6. Computer Graphics 
A7. Theory of Computability 
A8. Large Scale Information 

Processing Systems 
A9. Artificial Intelligence 

.,■ and Heuristic Programming 





'* Required. . t- * 

t At least two from each of the mathematics and computer science groups. 
o t Based on C.UPM recommendations. 
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Interestingly, Curriculum '68's influence also had a dichot- 
omizing aspect: Its basically mathematical orientation sharpened 
its contrast with more pragmatic alternatives. Most computer sci- 
ence educators agreed* that the proposed core courses included 
issues crucial to computer sdeiice. . However, the curriculum 
brought to the surface a strong division over, the way in which 
these -issues should be yiewed. In defining the' 'contents, of- the 
courses, Curriculum '68 established clearly its alignment with more 
traditional mathematical Itudies, giving r primary emphasis ■. to a 
search for beauty and elegance. Pedagogically, this implied a set of 
academic- objectives concerned , chiefly with .preparation for gradu- 
ate study leading to a career in research. Consequently,.those col- 
leges and universities holding with this perception of computer 
science saw Curriculum -68 as a reinforcement and endorsement of 
their orientation and sought to implement it commensurate with 
their resources. 

On the other hand, many educators felt the curriculum to be at 
odds with their perception of reality. They argued that the uses of 
computer science and the observed roles of computer scientists 
militate for an educational approach much closer to that used in 
professional disciplines. After all, the ultimate outcome of most 
computer science endeavors is a tangible product *(an efficient 
language processor, chemical process ^ntroller, graphic display 
vehicle, sales analysis system, niedical diagnostic aid, or other infor- 
mation processing system) whose primary use is likely to be outside 
of computer "science. The computer'.sciencefthat underlies such a 
product will be invisible to its users~or to its operation. How that 
science was applied, i.e., the way in which the product was en- 
gineered, also will be beyond the user's perception but the effect 
will be more direct, manifested in terms of the product's cost* per- 
formance and reliability. In this light, computer science education 
should have a strong professional flavor (it was argued), with 
design principles, general approaches to problem solving, and ex- 
periments with current methodologies receiving considerable atten- 
tion. This would be consistent with the expectation of professional 
employment starting at the baccalaureate level. Another, related 
objection pointed out Curriculum '68's neglect of business and 
commercial data processing, a set of general areas motivating the 
bulk of the hardware and software industries. 
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Thus the controversy was not merely a "conflict between "theory" 
and "practice." Rather, the dispute pivoted around the definition of 
"proper" theoretical material and how closely that material should 
be tied to actual problems experienced in the field. Strict adherents 
-to Curriculum '68 advocated continued use of material (such as 
formal language and sequential machine theories) pursued for its 
own ends in relative isolation fv^jn computing contexts. In addition 
to their innate .cultural value, such established ahd respectable pur- 
suits would continue to lend credibility to the idea of computer 
science. In this view, computer applications should be picked up> 
elsewhere. (A considerable number of educators favored a cur- 
ricular model irt which computer science would be taken as a joint 
major with some other discipline; pthehJelt practical knowledge is 
best acquired on the job either after gntehiation or via a work- 
study arrangement.) Moreover, the "relevant? theory, engendered 
by problems encountered in practice, would be an unsuitable re- 
placement because it still--, lacked maturity and coherence. Op- 
ponents of this view \ emphasized /the importance of establishing a 
continuing interaction between theoreticians and practitioners, 
contending that only in this way would it be possible to realize the 
Unfulfilled promise of a continuum from theory to applications. 

As a result, computer science growth continued with no decrease 
in turbulence. Even when basic direction was not an issue/ there 
were problems with implementation. Numerous attempts to install 
a program based on Curriculum '68 were impeded by its size or by 
the difficulty in staffing it with qualified faculty. Others found that 
employers, unable to exploit the background acquired in such a 
program, would riot hire its graduates. On the other hand, efforts 
to implement a more practically oriented curriculum often went 
awry because of territorial disputes between computer science and 
some other area (electrical engineering, mathematics, business, etc.). 
In some institutions this forced abandonment of the ideia of a sep- 
arate computer science department with the consequent distri- 
bution of areas among existing units. Others, feeling that no one 
program could handle the spectrum of concerns effectively, set up 
complete programs in more than one department, each with its 
own orientation and its own majors, (The resources required- for 
such multiple coverage ruled it out for all but a relatively small 
number of instances.) ;tr ; 
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In response to this turmoil, alternative curricula began to 
appear, each intended to answer some class of objections raised by 
Curriculum '68. A major effort in this regard was a management- 
oriented curriculum in information systems stressing the informa- 
tion structures side of computer science, with additional emphasis 
" on/systems analysis, project management, human communication, 
and organizational concepts. Curricula also appeared in software 
engineering; biomedical computer science, information science, ap- 
plied mathematics (with emphasis on mathematics of computation), 
computing center management, and computer engineering. The 
latter term still causes extraordinary confusion in that it evokes a 
mental image of involvement with computer Hardware that is arbi- 
trarily related to the degree of actual emphasis in a given program. 

This situation tempts the .conclusion that Curriculum '68, despite 
its solidification of the mathematical viewpoint, aggravated an al- 
ready existing state of chaos in computer science education. How- 
ever, there are overriding effects which secur& the curriculum's 
place as a major force in computer science's formative period. First 
of all, it provided a definite focus for discussion and response, 
thereby initiating the demise of the ad hoc approach to curriculum 
development in this area. Thus, while many (probably most) com- 
puter science departments (or institutions contemplating such de- 
partments) objected to something in the curriculum, .it became a 
reference against which extensions, contractions, replacements, re- 
arrangements, and other "improvements" were formulated and pro- 
posed. Moreover,, almost everyone interested in computer science 
education found something in the curriculum not to object to. As a 
result, various aspects of the curriculum were emulated in many 
institutions having fundamentally differing viewpoints. The point is 
that, despite the diversity of vantage points, there was considerably 
more consistency with regard to the areas of major concern to 
computer science. '» 

In retrospect, Curriculum '68 was an effective catalyst for intensi- 
fying this debate and nudging it in two fairly definite directions. 
The ACM, harboring no illusions about the permanency of Cur- 
riculum '68, remained a central participant in American curricular 
activity. Through its Curriculum Committee and Special Interest 
Group on Computer Science -Education, it provided a continuing 
fprum for exchange of ideas dealing with the full range of curricular 
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concerns. COSINE and CUPM also remained active, .continuing to 
examine the role of computer science within electrical engineering 
and mathematics departments, respectively. 

The decade since .Curriculum '68's announcement hasten the 
accumulation of a tremendous amount and diversity of educational 
experience. This, coupled with compelling advances' in technology, 
new and increasingly pertinent theoretical findings, and feedback 
from a rapidly widening base of employers, has exerted continuing 
pressure on curriculum designers and developers. A sizeable litera- 
ture built up on a wide spectrum of topics, including form and 
content of individual courses, laboratory support for computer sci- 
ence, core sequences, service responsibilities, and entire curricula 
[11]. One also began to see articles of a more introspective nature, 
dealing with the direction of academic, computer science research, 
occupational outlets for doctoral graduates, and other more con- 
templative aspects of computer science. Much of this writing was— 
specialized, concentrating on specific matters in a carefully pro- 
scribed context (e.g., implementation of a particular piece of soft- 
ware for classroom use, selection of a laboratory computer, logis- 
tics for incorporating outside problems in a class on applications). 

This hectic activity was considerably less, haphazard than its 
written products might imply. Disputes and arguments not- 
withstanding/computer science in the early 1970's was an emi- 
nently viable area. A crude but nonetheless interesting indicator of 
its vigor was the fact that many institutions could claim recurrent 
employer acceptance of their computer science graduates at all 
levels. Moreover, scrutiny of computer, science programs (especially 
at the undergraduate and masters levels) outside the perspective of 
local course differences, choice of teaching language, etc., reveals a 
coalescent effect that makes it possible to characterize many of the 
camputer science ; programs that have diverged from the Cur- 
riculum '68 model: 

1. Computer Science has a strong professional orientation, 
«■ drawing much of its motivation frpm practical ^foblems and 

V providing a population of workers uniquely ^iquipped to 
address these problems. 

2. Computer Science has an indispensable experimental aspect. 
;• That is, the computer's role extends well beyond its basic use 
, as tjie vehicle on which* application algorithms (expressed as 
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programs) arc implemented. In a very real sense, it is a crucial 
■ - ^ laboratory for experiments whose purpose/ distinct from a$y 
application, is to enhance 4he understanding of information- 
processing phenomena. 
3. Given the two premises stated above, there is , no one cur- 
riculum that appears to be substantially more eflecnve than 
others in providing the 4< proper" growth environmentlfor pro- 
fessional computer scientists. Thus the choice between iticor-. 
poration of a professionally directed computer science, cur- 
riculum within another department or establishment of a new 
administrative unit would appear to be dictated largely by 
university politics. # 

The ACM, in dealing with curricular evolution, has become in- 
creasingly sensitive, to the accelerating growth of professionally fo- 
cused computer science programs. Accordingly, the organization's 
second major curricular framework' [12] reflects a substantial shift- 
in that direction. The report identifies a combination of knowledge 
and skills considered to be essential for all computer scientists 
regardless of the exact curriculum in which these are acquired : 

1. The ability to produce correct (operable), clear, well- 
, documented programs. 
•~ 2. The ability to assess the structural quality and computational ^ 
efficiency of the program. 

3. Background in the applicability of computer techniques to the 
solution of certain problems. 

4. Background in hardware system architecture *and component 
behavior in preparation for configuration analysis and hard- 
wkre selection. 

Superimposed on these attributes is the general requirement that 
all computer science majors coming through an adequate core cur- 
riculum should be sufficiently well grounded in algorithmic tech- 
niques, programming languages, hardware and software systems 
organization, and mathematical foundations to pursue advanced 
studies in- computer science and/or application areas of interest. 

Fulfillment of these general objectives is embodied In a series of 
eight computer science courses required of all majors. As shown in 
Table 3, these- CQurses 4 assume support from six mathematics 
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courses (four required) to provide mathcmaticalmaturity and ac- 
culturation. 

Once the 'core is assured, the new curriculum expects a flexible 
approach to the remainder of the program, with emphasis dictated* 
by local preferences. As indicated by the breadth of the optional 
computer science courses (CS 10 through CS I9"ah~d the~"topics"~ 
courses) the core may be complemented by conceptual work in a 
variety of directions. Thus Curriculum 77 (as this revision is calledjT 
has moved toward a more balanced program in which (pnedocto- 
ralj professional employment is an explicitly expected (and perhaps 
the predominant) possibility. Beside the practical orientation una- 
voidably obtained from the beginning courses, many of the pro-v 
posed higher level courses are split between lecture and laboratory 
sessions, thereby reflecting increased recognition of the laboratory's 
importance. 

CURRENT STATUS AND TRENDS 

Because of the widely different contexts, it is not particularly 
helpful to make a detailed comparison between Curriculum '77 and 
its predecessor. While Curriculum '68 served as a very useful point 
of departure and helped crystallize two basic alternatives for com- 
puter science education to follow, the revised set of recom- 
mendations reflects the reality of roughly 65 American doctoral 
programs in computer science and at least twice that number of 
undergraduate and/or masters programs, with a substantial frac- 
tion of these being professionally oriented. 

The overall stabilization of undergraduate computer science in 
an engineering cdntext is indicated by the recent (1977) inclusion of 
such departments as eligible candidates for accreditation by the 
Engineering Council for Professional Development. Consequently, 
it will be more interesting to consider the major issues that remain 
open as computer science education completes its second decade. 

It still is premature to ascribe to computer science a coherent set 
of principles that unify its major aspects. The beginnings of such 
structures are emerging as behavioral data obtained from com- 
puting systems and are being accommodated by more and more 
comprehensive models" which are catalyzing the formulation of 
more, effective design methodologies. Improvements in resulting 
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systems, assessed in engineering terms (e.g., shorter implementation 
times, lower software failure rates), have accelerated the transfer of 
this new knowledge to workaday contexts. The overall result has 
been a noticeable increase in computer science research engendered 
by problems encountered in practice, and an accompanying con- 
vergence of theory and applications. Of course the effects of this 
coWergence^wilWary widely among academic institutions. 

There has been no slowdown in the flow of problems. Success 
With a given application, coupled with growing insight into further 
'improvement, usually encourages a more ambitious undertaking. 
In many instances the concomitant increase in complexity stresses 
current models beyond their effectiveness, thereby necessitating 
further conceptual work. For example, there are numerous ap- 
plications in which the advantages of complex configurations in- 
volving multiple computers are easily perceived. Moreover, the 
construction of such configurations is well within current techno- 
logical capabilities. However, the behavior of information processes 
on many of these complex constellations is insufficiently under- 
stood to allow their systematic exploitation. 

This type of situation has a more fundamental aspect, stemming 
from the fact that the hardware revolution is far from dver. The' 
current phase, centered around microelectronics, is producing tech- 
nologies that can place the processing capability of thousands of 
ENIACs on a single silicon chip. Moreover, the cost would make it 
feasible to configure systems in which such a device is but a single 
cd^^ient replicated an arbitrary number of times. The excite- 
m^^Pjplied by this "silicon , miracle" is vitiated by the thought 
thMpjfpningful use of such computational power must be built on 
aininderstanding of how such systems go together,' the structure of 
information consistent with such systems, and the behavior of algo- 
rithms operating in such environments. Consequently; the pressure 
continues to mount for computer scientists t6 develop ways for 
describing and examining these complexes. It is highly unlikely that 
the ad hoc usage of sequential machines, initially built up as folk- 
lore, can be^epeated for these highly parallel systems. 

The consequences of this cultural lag already have raised educa- 
tional issues sharpened by' Curriculum '77. Perhaps the most con- 
spicuous of these is the one stemming from the rapid blurring of 
the boundary between hardware and software. Accordingly, the use 
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of one vehicle or another to implement a particular algorithnvin a 
particular context no longer is a clear-cut matter. Experience with 
systems involving such decisions is producing evidence that major 
responsibility for these choices may fall to computer scientists. The 
"proper 11 place for this responsibility will be the subject of continu- 
ing curricular debate, with further dichotomization the, likely result. 
Willingness to assume this responsibility implies an extended com- 
mitment to hardware within a computer science program, thereby 
requiring substantial laboratory support independent of the institu- 
tion's central facilities and unaffected by the. operating restrictions 
that such facilities must impose. Consequently, the inclusion of 
active hardware pursuit at the functional level will become an im- 
portant attribute^ help characterize a program's position Within 
the spectrum of computer science curricula. But the breadth of that 
spectrum is not likely to narrow. This ongoing turmoil, fueled by a 
diversity of viewpoints, will continue to enrich the discipline and 
could well lead to convergence on not one but several viable ident- 
ities. \ - 
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1; INTRODUCTION . r .■ ■ : > V," .r • 4, 7. 

Whenever a^erson needs to use a digital computer, it is necess- 
ary 'to communicate to 'the machine the step-by-step procedure to 
9& foIIOwV- This act of communication requires some means of ex- 

* ^ steps in a form that is under- 
standable the person- spedfvifig the steps and the computer 

V- ;," ]$$t$aij is tq Execute them: la this paper we shall Jook at a range of 

* possibilities, from J diredt specification in at language that the ma- 
" cljine is designed to understand, to attempts for getting the ma^ 

. ^^liine to understand a language as close to natural language as 
possible. Historically, the ti&nd has been from simple machine- 
v ; y oriented methods to more and inorie sophisticated programming 
i h ' :VK ' systems that can better support the needs of a general-user com- 
? munity. in^^ proper 
ilf^ review a few^f the hi^p^itits in the 

■' ;V- dtvelopm^^^Bnguages for man-ma6h|ne;^ 

fe . y '. < .» •'. •52 
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1.1. Algorithms and Programs. The concept of an algorithm is 
basic to computer science. It represents the^fundamental infpr^ 
ma£itin that we must communicate to jany computing device for it ; 
to function adequately. An algorithm is defined [1] by the foljow-- 
iiig five properties: ; 7 I 

1. Finiteness. For a computational procedure to be considered 
an algorithm, we must guarantee that it will terminate after "a 

J i finite number of operations. . I . 

2. Definiieness. The step-by-step operations that specify how our 
computation > is to be made: must be described in terms of 
actions that are rigorously and unambiguously defined. 

3. Input. Values that start the computatiori must be specified 
* j clearly. '. . • ' . ■ , -. < .« 

4. Output The normal fraction of any . computational procedure 
"'is*. to operate on the. input valu^s^t6 produce some specified 

• . .output- ' ■ / - * * ] ' ■'■- 

5. ;. Effectiveness: The basic operations that comprise the compu^ 
.■■.■■>-. i:;tation^I sequence representing a algorithm must be such'that 

they can be performed in-ahteffpctivelmanrie^. This concept of 
Effectiveness is somewhat vague, but in principle 'it implies 
that a person using pencil and paper actually can perform' the 
operations' specified. That is, an effective and. definite oper- 
ation n might be "multiplication/' An operation that is definite, 
but not effective, would be "solve a complex rfonlinear three- 
dimensional partial-differential equation in five minutes." 
. It is the purpose of programming lariguageMo let us present our 
coipputatroiiy a computing devfce in such a 

manner that the algorithmic aspects are cleaily specified. That is, a 
'^programming language must enable the user to describe the ''sfc-V 
il/quence of definite operations which will guide t hie computing 
device from the input values through the Specified steps to the 
production of the final output values. : ; 

TWo related questions are outside the scope of this paper: What 
is the finite number of steps required for thlp termination / erf. the 
algorithm? Is the pirogram output correct b^sed on the original . 
function's specifications ?. These iteips are discussed in [2]i 

•'• . ' * !-..';' ' . iv 

1.2. Program Represeiuat^^^ the ideal programming 
^la^guage would be Ehgljbh, ^r some a x>ther natural language: Thefc 
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user^ simply could write the problem specification as* if conferring 

.with a colleague, submit it to the computer, and have the desired, 
computation performed. Unfortunately, this brings us immediately 
to the question of just what an effective operation is. Normally, the 
transformatiop from a vague problem specification in natural 
language to a precisely stated step-by-step computational pro- 
cedure (i.e., an algorithm) is a very difficult one, involving a signifi- 
cant amount of mathematical maturity, experience, and training. 
Although there has been considerable research in this area, our 
present systems are. still not capable of "understanding" a natural 
language dialogue to an extent sufficient to automate this trans- 
formation. Some aspects of this work are discussed in Dr. Slagle's 
article Weinberg [3] considers why natural languages are hot pro- 
gramming languages. ^ ■ 

AUhe other end of the language spectrum, the early days saw an 

^operational sequence specified by actually plugging wires into 
appropriate locations in the computing device. To carry the infor- 
mation from, say, an arithmetic unit to a printer required physi- 
cally connecting the two units. Clearly, thcflexibility and power of 
a hard-wired or plug-wired device leaves a lot to be desired. The 

* concept of the stored program computer, in Which a general 
memory served to store both the data to. be processed and the 
instructions that described the operations to be performed, was the* 
key that allowed, the versatility needed for the development of our 
modern computational ifiethods. However, the programmers of the 
early devices ran into a serious problem: They had to actually set 
the exact values pi the stored program instructions into the 
memory in & crude, laborious, manual fashion. Significant concep- 
tual breakthroughs occurred when it was realized -that once a 
person specified such a program, a njajor savings of effort could be 
achieved if another person Could access this prior work easily and 
directly. Thus, early in the 1950's; systems based on libraries of 
reusable programs cam6 into existence. These made it convenient 
for one person to use the programs created by someone else, either 
as independent processing procedures or as components in larger 

\ programs. Rosen [4] and Sammet [5], [6] have presen^d extensive 
and detailed discussions of some of these early developments, along 
with the factors motivating them. , 
(in ordeV to clarify these language concepts and to illustrate fur- 
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^thert^eTelatibn between program language specification and algo- 
rithmic representation, we shall look at a number of particular 
approaches 0at have been developed over the/past few decades. 

2. ASSEMBLY LANGUAGES . 

2.1. The Assembly Process. If we define the operations in our 
algorithmic specification tQ be the instructions executable directly 
by a. computing device, then is no question but that 1 * our 

operations will be definite aricReffective. An operation such as "load 
a value from a memory location into a register" is a clearly speci- 
fied and effective operation. The actual machine instruction usually 
will be a pattern of zeros and ones (i.e., a binary number interpre- 
ted as a sequence of bits) which, when properly translated by the 
control mechanism in the computer, first will direct the hardware 
to select a particular memory location. The information contained 
in that address would then be transferred into an accumulator, a 
register in which all arithmetic and logical operations take place. 
Further operations may then be performed on the datum moved 
into the accumulator, such as "add on a value from another" 
memory location." ^ '\ , 

N It was soon found that the (conceptually) simple task of assign- 
ing specific memory locations for various items of data is a task 
with a high potential for clerical mistakes. However, it is reason- 
ably straightforward to program a computer to perform this job: 
First, make all of the memory references symbolic ; then tabulate all 
of the symbolic references; finally, assign sequ^irti^l memory ad- 
dresses to the list of symbolic addresses. As a consequence, one of 
the very earliest software tools (i.e., a program developed to help 
people work with thfc computer) was the assembler. This program 
intercedes on behalf of the programmer, who uses symbolic ad- 
dresses and mnemonic instructions and takes over most of the 
routine chores of preparing a detailed^ set ^ 
for computer .execution on a particular type of equipment. Stein- 
hart and Pollack [7] present detailed directions for using one par- 
ticular assembler program, but the principles- of using an assembly 
language are easily learned and transfSred^to ^fher machines and 
systems. ' v 

It is far easier, for a programmer to remember and to \yrite 
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"LDA" to request the operation of "LpaD Accumulator" than it is 
ta remember the series of bits, that must be stored as a machine 
language instruction. In a similar fashion, it is far easier to write the 
symbolic reference "X" than it is to remember where the item of 
information called X is really stored and to always use that as- 
signed numerical location. Thus an assembler is a program that 
can accej^as input a line such as ;\ 

• y -.■^SbA'X/' / 

and generate as* output Jhe corresponding bit pattern representing 
the -actual machine operation tbki must be executed, including" the 
assignment of an appropriate location to stdre."X". 

Figure 1 illustrates the assembly process for one specific ma- 
chine, the Texas Instruments Model 980B minicomputer, but the. 



As part of a larger problem, compute the sum of A + B + G and store the result 

in D. ' , ■ K ; - 



Input 
Assembly Language 



<label> <op code> ; <address> 



ADD 
ADD 
STA 



DATA 



DATA 
DATA 
BSS 



A 
B 
C 
D 



7 

11 
1 



Output 
Machine Language 
(numbers are base 16) 

<location> <contents> 



0100 
0101 
0102 
0103 



0161 
0162 
0163 
0164 



0062 
2060 . 
205B 
8060 



0007 
0009* 
000B 
0000 



Fig. 1. The assembly process. 
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•principle is the same from microprocessors to extremely large ma- 
chines. The TI980B uses r a memory organized into 16-bit address- 
able units (words). One word may contain an instruction of the 
form: 

0 4 5 6 7 8 15 



<op code) 



IXB 



(displacement) 



where: <op code) = 5-bit code indicating the operation to be per- 
formed./ 1 -\ 

IXB = 3 bits to describe how the machine will interpret the 
address specification. 

(displacement) = 8-bit field whose exact meaning depends 
on the setting of the IXB«bits but whose^basic purpose is 
to indicate the address of the datum involved in the oper- 
ation. In the example of Figure 1 (IXk = 0) the address is 
given as a relative displacement from a reference address. 

Memory words also may contain^ character codes, logical 
switches, integer numbers, floating point numbers, or anything that 
we wish the bit patterns to represent. In Figure 1, integers were 
assigned and stored for the values of A, B, and C. Since one hexa- 
decimal digit (base 16) represents any one of 16 possible values, it 
corresponds exactly to 4 binary bits. Thus, each of these 16-bit 
binary numbers is expressed as 4 hexadecimal digits simply to 
shorteiMhe number of characters required to write then?. 

V The assembler statements consist of three fields : 

(label) (op code) (address) ' 

where the fields are each terminated by one or more blanks. The 
operation codes used in the example consist of the instructions: 

LDA X = L6ad the accumulator with the contents of memory, 
location X. 

SJA X = Store the contents of the accumulator in memory 
location X. 

ADD X = Add the contents of memory location X to the con- 
. • ♦ tents of the accumulator, leaving the sum in the accu- 

mulator. 
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The TI980B assembler actually uses a total of 99 such instnffctions. 
Tfiere also are assembler directives (i.e., instructions for the as- 
sembler program itself to direct the code generation process). For 
instance, « * 

<label> DATA <value> directs the assembler to assign the cur- 
rent location as the value of the symbol in the label field 
and then to store the given value in the current location. 

<label> BSS <count> assigns the current location as the value 
; of the symbol in the label field and then advances the 
current location by the number contained in the count 
field in order to reserve that many locations for data to 
be stored later. 

The TI980B assembler has 21 such directives. 

One computer software tradition was established soon after the 
creation of the first programs: Once.a program is running, some- 
body will think of an improvement B or an- enhancement that must 
be mad& to obtain a new and better program. Thus, the basic 
assembler had many features added to it to make it more con- 
venient for tjie human to interface with the computer hardware. . 
Although the assembler already had built into it every possible 
machine instruction, assembler directives were added to aid it in 
allocating storage for data, printing, and spacing of output into 
more easily read format, conditional features to allow code vari- 
ation, and various options for saving the generated machine code 
for later reuse. All of these enhancements were^ directed toward 
making the human/machine interface a smoother, more easily navi- 
-gated-boundary: — — — - — — : ^ 

2.2. Macro Definition. An interesting early observation made by 
programmers was that they were continually writing the same pro* 
. gram fragments over and over again: As long as there is a block of 
code that is to be duplicated, with perhaps only a few changes to be 
made from one copy to another, it again seems reasonable to let 
the computer do this essentially * ntfechanical work. As a conse- 
quence, the concept of a macro definition was implemented in most 
assembly language processors. _Th|s procedurejtliqwsja_xode string 
to be defined once in terms of a set of parameters and an assigned 
name. Parameter values then would be specified later in terms of 
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variables to generate the actual desired code. Wegner [19] de- 
scribes the details of how this may be accomplished. 
^ Figure 2 illustrates this process and demonstrates some of the 
power and flexibility that the macro capability provides to the 
programmer. In the example the name of the macro is SUM, with 
parameters A, B, C, and D. All of these parameters enter into the 
address fields of the instructions contained within the macro body, 
but we could just as well have written an operation code or even an 
entire instruction as a parameter. The macro body may be con- 
sidered as a template in which the formal paran^ers are replaced 
by the actual parameter values at the time of th^Rl of the macro. 

Macro Definition: MACRO SUM A,B,C,D 

* LDA A 

ADD B 

N 'J • 

ADD C 

» , STA D 

• END . 



Macro Calls: 



Input 



SUM X,Y, Z,A 

I- . * 



~LDA 


X 


ADD 


Y 


ADD 


Z 


STA 


A 



SUM A,B,Z,X 



LDA 


A 


ADD 


B 


ADD 


Z 


STA 


X 



Fig. 2. Macro definition, 
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The correspondence between formal and actual parameters is de- 
termined strictly by the ordinal position of the parameter in the 
parameter list. f < - ■'>. 

Although the simple substitution discussed above may be very, 
convenient, most of the real power of the macro concept arises 
when other features are added. For example, a macro body also 
may contarin a macro call, producing nested macro calls. Con- 
ditional directives may be' added, in which case a given block of 
code may be generated only if certain conditions are satisfied at the 
time the macro is called. Finally, the macro definitions themselves 
may be nested, allowing parameters to be used to change dynami- 
cally the macro definitions.^ 



2.3. Summary. By ' iniplenfSinting algorithms in assembly^ 
language, the)programmer has complete access to all of the capa- 
bilities of the hardware. As a consequence, a great many of the 
operating system programs (i.e., programs dealing with the details 
ormemory management, input/output, control of task execution, 
etc.) and also the ultt^y programs (i.e., copying 'and reorganizing 
collections of data, sorting programs, etc.) are frequently still writ- 
fen in assembly language. This also allows the programmer to 
obtain the utmost computer efficiency for execution. The process is 
one in which the programmer specifies his algorithm in terms of the 
allowed symbolic language statements, passes these statements as 
input into the assembler program, and then collects the output as a 
series ofi machine instructions for execution. The output code may 
be stored on some external storage device such as disc or tape. At a 



later 



timeT^fienTitTs^desired to execute rtie progfanFlhus as~ 
sembled, another program (the loader) will be instructed to go to 
the external device, pick up the appropriate assembler output, load 
these instructions into the computer memory at specified locations, 
and. then transfer control to those instructions for final execution. 
The entire process consists of the four sequential stages: 



7/ 



1. program preparation 

2. assembly 

3. loading 

4. execution 
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3. PROCEDURE-ORIENTED HIGH-LEVEL LANGUAGES 

A line of assembly language code produces essentially one ma- 
chine language ^instruction. A line of code from a "high-level 
language" may produce an arbitrarily largfc number of machine 
language instructions. Moving from an assembly language program 
to add two numbers and to store the result: 



seems like a relatively small step. In practice, of course, we want to 
go in the reverse direction, from the higher level to the lower. 
However, higher-level languages accept statements that may con- 
tain a complex structure that must be analyzed before the output 
(essentially a set of assembly language statements) may Ufe gener- 
ated. Use of a high-level language may achieve a large improve- 
ment in the ease of communication from man to machine, but only 
at the expense of requiring the computer to perform the extra work 
of structure analysis. However, this approach still requires the 
effort on the part of the programmer to; understand the intellectual 
concepts involved for the program preparation, since the necessary 
algorithmic steps still must be described in detail. Essentially only 
In^ine^dependent mechMicalndetails~arersaved-by-the-use-of a 
high-level language. 

There is a basic difference between the formal artificial languages 
of the computer world (PL/1, FORTRAN, PASCAL, COBOL, etc; 
see [8] for a more detailed description of many of these languages) 
and a natural language such as English. We know how tp do. a 
complete structural analysis of a formal language (Section 4), but 
we do not know how to do the equivalent analysis of a natural 
language. The program that performs this, language analysis and 
generates the corresponding machine language code is calledthe 
compiler. Higher-level languages continue to evolve, along with our 
ability to create compilers, so that source languages are. becoming 



LDA X 

ADD Y . % 

STO Z, 
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more convenient J6v human use. However, we are still far from 
being able to write compilers that accept a completely "natural" 
language as input. Variations of FORTRAN will be with us for a 
long time to come. * 7 * 

• • * * 

3.1. Data Elements and Variables. We sh^ll use the expression 
"elementary datum" to describe a basic unit of information that 
can be accessed and modified as a single value within a given 
programming language. Normally, elementary data items will con- 
sist of such things as numbers, either integer or floating point, 
character strings, Boolean values, bit strings, or references to actual 
machine addresses (called pointers). Since/these items represent the 
fundamental uriits of information that must form the basis for any 
algorithmic development, it is essential that a high-level language 
provide a mechanism bywhich the programmer may store, retrieve, 
and manipulate th^se items. The standard mechanism is, to provide 
symbols called vdriables to name the datum. Additional symbols, 
the operators, indicate what processing is to be done. 

The normal basic variable in a standard language such as FOR- 
TRAN or PL/1 essentially names a given location in thexomputer 
memory. A reference to this variable, then, is a reference to the 
value stored at the indicated location. Thus, a statement of the 
rform r 

■ X = Y ' . 

may be interpreted as telling the computing system to go to the 
- location-named by X-reproduce^tlie valu_e_of the ite m stored there, 
and move it into the location named by X. This seemingly simple 
operation may invoke a number of obvious and also subtle prob- 
lems. For example, if the datum named by Y happens to be an 
integer numeric value, and that named by X is supposed to be a 
character string value, then what does the assignment of values 
really mean? Clearly, in order to maintain a consistency of infor- 
mation described by the symbols, one of two alternative disciplines 
should be enforced: 
■ . >' * ■"■ ■ / •". W ■ ■ » ■ ' ' ...... • 

1. Data elements, on being stored into named locations, must 
take on exactly the characteristics implied by the variable 
— £iame with its attributes. . . ' . < . 
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2. Data ^elements, on being stored into named locations, must 
also carry descriptive information giving their exact character- 
- istics. ~*>— 

As a consequence, all languages must Ijave specific rules as to what 
elementary data items they allow, what elementary data items may 
be automatically translated from one (internal) form to another 
(and when), and how the variables take on (i.e., how the machine 
"knows") the appropriate attributes of the assigned data items. 

Thus we see that a fundamental decision must be made in de- 
signing a programming language: Are the variables to have fixed 
attributes defined before the compiler processes the program (as is 
the case in languages such as PL/1, FORTRAN, and COBOL) or 
may the attributes vary, as defined by the data stored in them at 
the time the program executes (as in APL, SNOBOL, LISP)? As a 
gross generalization, we can state that the former tends to produce 
more efficient code, since the compiler knows more about the data 
types to be processed. The latter tends to produce more flexible, 
easily used systems, since the programmer does not need to be as 
concerned with the details of the data while still writing the 
program. - 

To illustrate this dichotomy further, the PL/1 language requires 
a complete specification at compile time of the type of information 
that will be associated with a particular variable name (the type 
information may be provided by the programmer or by the 
language default values if not explicitly specified). That is, during 
the compilation of the program, the PL/ 1 compiler knows that a 
reference to the- symbol X stands for exactly one type of datum, 
such as numeric value with the attributes' floating potaf, base 10, 
HT:Tsi^ffi^nra 
unchanging throughout the execution of the program. All gener-, 
ated.' code- ..that refers to this variable will be specific, thereby mini- 
mizing the amount of data-checFing that must be performed. In 
contrast, SNOBOL, primarily a string manipulation language, does 
not have any preassigned attributes associated with a variable 
name/Consequently, a very dynamic type of environment is cre- 
ated in which the attributes associated with a particular name will 
be defined (bound) onlyj when an actual, value is stored. We may 
write an assignment which stores an integer number as the value of 
the variable X at one point §n*a program, and then a later assign- 
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ment may store a character string for that same variable. To handle 
this situation, the executing code must at all times check the type of 4 
information stored for the particular variable to ensure that con- 
sistent operations are being performed. If the variable X contains a 
character string and X appears in an expression requiring numeri- 
cal addition (X H- 1), appropriate instructions will be required to 
convert the character string form into a numeric form for the addi- 
tion operation. If X was then reassigned a numeric value, the code 
required for the evaluation of the same source expression (X ^ 1) 
would be quite different. 

It is an unfortunate situation, but a very common practice in 
mai\^ current programming languages, to use a single symbol to 
stand for two entirely different operations. The exact meaning may 
be inferred only from the context in which the symbol appears. One 
example of the problems that this practice creates comes from the 
'use of the equal sign, " = to mean both an assignment of value 
operation and a relational .test for equality. Thus we have that 
strange-looking assignment* statement that has confused gener- 
ations of fledgling programmers, 

X = X 1, 

where the " = " really means "assignment of value." In PL/1, this 
dual meaning produces an even odder looking statement : 

The first equal sign is an assignment operator and the^second one 
is a relational operator. Thus A is assigned the value^df "true" or 
"false" depending upon whether or not B has the same value as C. 

~3.2. DatarStructures and Storage Structures. ^o'nrtally,-th^IogicaL 
constructs involved in solving a problem or implementing an algo-' 
;rithm : require more than just the elementary data items. We use the 
expression "data structure" to describe that particular logical entity 
4 that represents the information that we actually wish to manipulate 
in the algorithmic process. For example, we may wish to deal with 
the concept of a set or a graph as the basic structure in our process. 
Two particularly common types of data structures occurring in 
scientific and mathematical computations are vectors and matrices. 
These ""arrays normally are composed of elementary data items as 
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"previously described. In the algorithmic sense, if our data structure 
consists of sets, then 0 the operations we might wish to perform 
probably would include such things as "is a particular v element a 
member of a particular set," "find the union of two sets," "find the 
intersection of two sets," etc. These are welUdefined operations on 
^ell-defined data structures. However, when it is time actually to 
implement an algorithm oh a real computer we are faced with the 
question pf how to store a "set" in a, computer memory that . has 
only a linear addressing capability. Also, we must determine how , 
set operations are actually executed; given a particular method of 
storing the set informaitioit. The implementation of an algorithm 
always requires an answer to the question of what storage sVucture 
should be used to represent a particular abstract data structure. 
One of the most common stumbling blocks in algorithmic im- 
plementation sterna from failure to differentiate between the data 
structure, which is conceptual, an^ the storage 'structure, which is 
an actual realization. 

Figure 3 illustrates the comparison of data structures and stor- 
age structures for a vector (a one-dimensional array) and a matrix 
(a two-dimensional array). The notation is to indicate that the first 
element; of the vector, V(l), jnay be stored in the addressable stor- 
age location "L". If an element occupies "s M addressable locations, 
then the second vector element, V(2), will be stored at location 
"L + s". For example, the IBM System/370 computer attaches a 
separate numerical .location to each byte of storage/ and a double 
• precision floating point number occupies 8 bytes. Hence if V(l) is 
stored at location 1000, then V(2) must be stored at location 1008, 
V(3) at 1016, etc. ■ ' ' 

In the full linked list forth Ihe notation means that one cell 
contains: , ^ , • • , . 

L The datum 4 ' *■ \ 

2. The address* of the next cell. 

. f ' • ', " ., 

the lafet cell containsji special value in lieu of , the next cell address 
v to indicate the termmation of thb list! The matrix form also may be 
stored in a solid column* major order, or in one of a number of 
possible variations of, the linked list forms.- Extensions to; higher 
dimensional arrays may; easilj^bie accommodated. % 
Given a storage stru^iire implementation of a .data structure, it 
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■ >V>, - -Matrix HbvrMa|0r ^^J^^Stb'i'age . Structure:- 



location . : , ■ .Content : S 



» -. ..Mi? 



♦ r\'^ IG - : 3i'P|Ja istructurwan<|-storagc st^cttircf'. : • ^.^^i'^p;? 

^; f 3^?? iis necessary tpi -^^^^^He mqpgiig . function that lets/the " 
^ /!P?V^ c ^?^iw^ ; 6f";"T^ >l stpra^e'>tru^ r re when- 
evw^fo dorriesporiding^ 
^ ' in the algorithm, For. example/, given the dafa structure r for a " 
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PROGRAMMING LANGUAGES ^ND SYSTEMS 

matrix, how does the program 'actually find the -loc 
vWtie^-'fh^'tWuin'is required? The matrix example ;|s an interesting. 

bne;in tlie sens©-that, for a solid representation, |t is ^ssible to 
^compute a specific location for the desired element Jdirectly from j 
«he indices' in1he arraj( reference. Thejnap$)ing function in this case 
; would Be (for roy/<naj or sforage order): ^ 



y. • L[a(i, L[a(0, oii] V s \ (d(2) * r^g 



<?■ where : L[ ;] = "the location of • " S . . - ; . . if-. '., "if f||t 
) : = size, in addressable storage , units, of^one jdatiirt item 
(normally fixed by : the compute^syrchitecture). 
■* * Kv^(^= number ; of elements li£ the second dimension (the 
J fj :: *:\ rows) of thp matrix. ; j- " 

.Thus it is sufficient to .kno^the values of L[a(0, »)], d(2X and s to 
store, or retrieve all of the elements of any two-dimensional array, 
based on this storage structure. ■■■■[^ , 

' In some storage structures, such as in jhe vector reptese^tio'n 
using v a full-linked list, it is not -possible to "get the fifth elemtfrit in 4 
*the list without actually .stepping ^through 'the list by finding>the 
location of. the first, seeond; third* and fourth elements) Onl^the 
fourth element contains the required location" of the fifjlh^ elem^t.; 
In the sparse form, the fifth <^6ment will* appear only if itfia needed v 
(say for nonzero values being stored). Consequen tly, the j subscript? ^ 
values must: be, checked during a linear scan of thfe list itt o 
determine whether the desired element has bfcen found.br if it is ' 

pissing. •• / ' / \ " : *'-,.\:' \ 0 

. Element retrieval questions, however, are not the only important 0 
• aspect of Ihe storage structures used t o implement a particular data , 
Structure concept. Consider the case in which it is desired}tb. insert : 
• . a new elenient betwieen the fourth and fifth, elefnenfs ih an existing- 
Rector. Given a packed representation of the 1 vector, the only way | 
sUch ^n element could bl inserted is by physically moving all of th|, 
forllowing elements down one location in the stb&ge structure. In a;*. : 
; ^1 i Ak^^ Mst^ a h owe yer, such an insertion can be made simply 7%? <v , 
> cfea^hg a, new element ; and ; changing one 'address pointer; It is ^ 
cph^dei-ation^ such as these that indicate that; the- study S^ata 
strffltures and storage slrffctares has >a , tremendous iinpact bmjthe 



..4?:; 
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Y-'^ ■■■ • ■ ■ - ■ ? 

• efficiency with which computations are d^sighbd. Pursuing this 
topic further is, beyond the sebp of this paj^ Here w limit 
ourselyes to an examination of the languageiechniqiies by which a 
desired structure may be implemented. - ♦ ; ,v ■ ; 

° ; ', : . f- ';-- i v' " ' . ■^■'■'■^ " : • ••. :• 

?\ 3.3. Control Structures. When an abstract algorithm is^mplemeii- 
ted, a specification must be introduced in the implementation 
language to indicate precisely 4he sequence: of operations tq be . V 
.performed.' We shall assume that the sequence in which statements 
are Vritten will be the seguence in which they would nonftally be 
expected to be executed. However^, variations from this 'nornial 
sequence must be allowed so that^j^^lic operations, conditional 
/execution of statements, and othei^.M decision-related events can 
be accommodated. \^ • ■ ;../.';. 

In somG?pf the early higher-level languages, such as FORTRAN, 

• it was thought sufficient siniply to carry oyer the assembly 
. language control structure, embedding it in the higher ' language. 
. Tfiius,*we find such statements as % r . ^ 

'-1$: .'--v. : . %■>■: *V' : ' ■ ' = ; ' i" ■ • ■ . ■' . f 

W# : goto io :: \y. , 

for the unconditional transfer of control, and the arithmetic IF, 

" ' :-. ; f ;\ ;« : if (X) ib,20,30,\ ;: - ■ ■ j ■ Y * , 



tests ,a / numerk for negative (GO TO 10), zero 

'''^■'•^^{GO^Q 20)/ or positive (GO TO 30), to^produce, 1 a conditional 
tran$fer } of control/These sta^ments-esserii^ly' i: represent a direct 
,v ; v carry-over from the machine language of th<: computer on which 
FORTRAN originally was designed aricl implemented/ Unfortu- 
'V'hate|^^kternents' of this type induce a tendency.to Write programs^ 
" invotfehig a convoluted fegicaf of control. With transfers back £ 

and forth between small segments ;bf coding being quite common. 
. '. ; It has been found in practice that programs haying this" fragmented 
: characteristic are very difficult to debug and maintainj$pecifically, 
it is very difficult for someone (even the original programmer!) just 
to trace through the logic of a program in order to establish exactly 
what the* code will do in a given situation. We shall return to this 
point shortly. : > ,„ ■ / ( ^#^/ \ ^Y' • -V : 

^mputer code re- 
va number of sep- 
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■•• ' : i ; . ; -v.-.^ j" ; ■ ■ • ■■ ■ ; :'..V ■ 

aratea^d distinct factors: , . 

1. The actual -sequence of statement^ as written by the, pro- v 
grammer. . -\ . -..v 

2. Within a statement, the order of eya^ajlibn of the operators^ 
as defined in the language itselfa '.'-U^Pl 

3. Techniques used v^itKin the co^mpile^t^prQduce the final (op- , 
• timized) output code.| . ' . ^ |/' / '. ■■ : v ' 
The first item is clearlyj^^te the cbnt^l $>the programmer,, but ^ 
the bther tvyd have eff^ 

' The result of a sequence! of bperaticJ^ single statement, 

can depend heavily on the properties 'of the ^bmputirig devic^on v. 
wtiich the statement code is executed. Qrt^ o 
lems is that computer arithm^ 

and hence differs JrSm definitions, /For 

example, - floating- p^ ast we .would 

assume hdrmaL Arithmetic ^qaitioi^ is, donsider the 

sum A + B +' G, >yhere ^ t he^- Va 1 ]^^Vof - ^j^^^J^- a re slightly ^ter^ 



than the smallest number^hat,Wi^j of T^^ve ' 

perform, 
be the same 
value of A. 



((A*;b)^ ■ , 

As. a consequence; the code ( generation al^9ritKms>sed by th*" 
compiler may aftect ^ theci^ math- 
ematical sense tjie ordw ^ : • 

Another significant cause'^ a single;?tate* • 

* it: ...» _< , iir.._„»!„..i'»t,ot ;«..^T>**c;HAVfTOrfc PnnsiHftivfhfi 



ment, is thd use offuhctlQfi^at ^ in^|^si^e;eft^'cts.. Consider the ■," V; 
simple example oFa fu^c|idh/.caU||^I>F(X)f , thaT returns ^ 
value (X + l)/2, but alsot changes iH^vaJue' of .the argument X to v 
X + l.In PL/1 this functidii § (^j^titt||Ssp- ;.,,>;•'. i': ^ '•vvfl 
^ ' PROCEDURE HA£^xCREfuRNS^(FL& AT); ; 

..... -; : - ■ . . ^ jfx^XmM-' ' ' 

• , . END HALF; ; " ; ' : ''\ f V'U, : . 
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■ -The expression X * ; HA£F(X) now becomes ambiguc 
bn^-.i Valuer depend£,upqri tKd>rder°in\ which the operatic 
v v -^med. ThaUs, first findirife the value of HALFOQ and 
^~^V0|>iyi^ by the value of x; will not produce the same r^u^BPtakirig 
'^|^1^fe' y -^ of '^. : ? ild ^^' 11 . multiplying .it by Jhe valiie of H^LlF(X). 




Se sequetic#: 

^5£afiSftr^;-:-;/^> : ; y = x.* half(x) ; 



; <; : ■ r^^Ttl^gif c^^uce 8 r " fioi^'^be former case ^hd 6 for the latter. 
\; Witfiin the past ^ecadera^ajftr question in computer science ■ 
has been to d ecide-whaiUty^ ,f ea - . 

♦tures should be included in, new language designs. The objective of 
the$e structures would be to : v . 

o . -^Maximize the probability bfwriting:a correct prograih.. r \--V'. 
.• . /^ of the code. ' 

^ 4 l: Make it easier for, a CQ mpilerJo-analyse-theAflbw of control to 
^ machi^b;code that is generated. > 

v'/The results of numerous studies cl^rly indicate 'the program logic 
-can be more clearly written and explained if the.possible language P 
^ constructs are restricted to a few fundamental contort structures. 
. These control structures all have one very- significant feature~in ; 
X common: 'There is only one entry poin\into,the structure and only; 
. one exit poiritjfrbm the structure. This is in contrast to the unre- 
stricted use ojf^ the traditional, GO TO method of control transfer. 
;! McCjowan arid Kplly J [9] discuss these motivations 6xtensj^yv'" 
• / The Basic allowable control structures may be seleci^|r^nl Wf\i 
number of potential typK biit. the followihg. set is su^dent. Jfor 
'most algorithmic design or programming tasks : y : ; ^ ? 

W h Normal sequential/spguenting \ " 
Jf; 2rlF - TtfEN - ELSE V— * ' • 1? >- 

!3.'DQ^wi|itE;^^^.; r>C~ ■■: ; % ' 

• • ; *>: ^ , 4. REPEAT; ^'UNTJli;^ J$XX : ' :r " ''. ■ ; • . ■• V 

> - v Figure 4 illustrates hoW iKefe s^|tures affect the flow of control' in, 
"v- : • a normal program/ ' ' : /./*'.'; flfy: - * : 

"Not only do these striictur^ constructs enhance ciarUy pf ex-r 

j pression of an algorithm, 6 1^ also aid in desigiijg^ a co^ct pro- 
• gram by using a top-down- design approach [9^If ; jwe ^st^t by 
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2> ir-THEN-ELSKj 





Fjg. 4. Structured control blocks 



expressing 011^ algbritr^ in terms of very large scale °P er ^ tioi i 1 ^j^^ 

thereby limiting its de$6rirjtio n ta£ k^^^^^^^^j^ dotftrol ' : 

structure^ we may. essentia A 

that we haV^^ritten: A series of 

every level, a complex operation 

detail, will still maintain the ori#ii$-1|^ 
: each^efinempnt is itself correct. This |f|p^^ 

continued^ ever-expanding detail ^JttS^p^ > 
^ ''been specified at a^Jevel corresponding to that of the progmmmihg 
t. language we are :usin&; One important advantage of using a high- , 
-V ■ . v ■'■" : ( " '^-M ' '. • 5;vV '.: v '.>' . ' ' 
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r • ; ... • ' • • ' ' ; .. ; t .;/ 'i . ■ •.'•'* ' . .*. 

" " level language is that thi^'J^ement^may stop much sooner than it 
, r v .^; v J.?^ftW ha ve if we were com nfcl lecbto;refine all the way down to 
v machine language jnstruc'ti'^ follows that the 

: v.. ' higher the level of the language* ^available to us (the richer the 
language and data constructs), th/easier it is to it^liment algo- 
• • rithms correctly. ■ . '/' ■ \* : $J0^y > 

v^'; ■ '"' ' . ! , . : . ; : - . "•■"">> 

. 3,4. An Example Program, In order io clarify some of tlfe pre- 
vious remarks, let us look at one;sample problem. We shall illus- 
■ \ .. trate the development of an algorithm; the conversion of the algo- 
rithm into a program, and the corresponding control arid data 
structures* needed for the implementation. Sorting by .^insertion is 
: the . problem chosen tot illustrate these ■ principles. In Figures 5, 

Y ■ ; column 1 represents a gross* statement*" of» (he problem. Column 2 

V > • represents ^an expansion of column 1 that includes some of the 

detaiisibf. the algorithm being implemented, but sittl not in suf- 
; ficient^jetail to code directly. Column 3, a further refinement of 
f f column 2, is much closer to the levej of detail requited for cording. 
Column 4 represents the final conversion of the flowchart intathe 
" PL/1 programming language. , r •. 

At each stage in , the stepwise refinement of the flowchart, one 
operation; was expanded into a more detailed description pf4he 

blocks l^ted in Figure 4.'Thus the 
one-in/one-out control logic is always.'>maintair 



- Qsimple.one-in/one^ is aTways.>iriaintained''duringi 

^ r development of the program. After the aigoritKm Jias been- 
y. ^ failed sufficiently by/t^hi^proces^U; rtiay be conVertga, into anjr 
; laiiguage. A language with-p^^to 

\t£0f- data structures closely resembling what appears. in tfe r^SlJ^i?o b lem* 
iu£ .^rr wilt clearly make this final coding task ;easier ^ap |f^totally 
'unrelated language was used. '^-Y^ 

:'Ub&>-\ 4. HIGH-LEVEL LANGUAGE TRANSLATION ' : 1 

, w Detailed specifications for two separate aspects of aerogram— ^ 
; : .^ing language must be presented in order to describe it completely. 
/The SYNTAX (i.e., .what symbols may appear in what order) must 
V be unambiguously. ^defined in order to establish wh|^l legal sen- 
; I tence is. Furthermore, once we hav6 ■ identified ( a i^g^ 
; r must also specify 3^£A^NT/CS (i.e., fhe metiimg oftto* Bar- 
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ticular phrases that arc introduced into the language). Tficle two 
sets of specifications then must be encoded into a program (the. 
"compiler") which will actually translate the source languagc;.state- 
mcnts provided by the programmer (described by the syntax^into a 
set of machine language instructions (described by the semantics). 
Aho and Ullman [10] present an excellent treatment of this entire 
process. - ■ ' 

As our knowledge of language constructs has improved over the 
past few years, we have' been able to develop software tools that 
incorporate algorithms for the transformation of these two specifi- 
cation sets directly into a working compiler. This type of software 
development tool makes it possible for us to create and test new 
languages and language features at a minimum cost. The creation 
of a new compiler today is not the vast multiyear, multiperSon^ 
project of the 1950's and 60's, but, instead, a project in which the 
dcvcJopcrs need concentrate only on the principles of interest. 

4.1. Syntax Specification* How do we know the statement 
X.= A + B is a valid FO&TRAN assignment statement? This 
question could be answered by listing every possible valid assign- 
ment statement and then searching the list every time we wanted to 
see if a g|\fen statement was included. However, the size of an 
exhaustive list makes thjs approach impractical,* " An .equiv- 
alent cli^k may b|^one~ by a constructive approach, sufficient for 
the types pf jangyages thitWe areiConsiderjhgv by defining the 
languagey^li^^ grammar. A context free 

*grammarTCFG) consists^ fo^^artsr 

1, TmmWs— the basjc symbols which compose strings in the 
language (i.e.,- A,B,0, 1, + , - , . . .). T ■> • - 

2. Nonterminals— -special symbols that denote sets of strings, (i.e., 
( variable)* (assignment statement). The and are con- 
vent^ brackets for these 0 symbols.). f Also referred to as 

^ r phrase names; K 



.. Start denotes the set of 

strings ' cbn^istinglo^ all of the^alid'^^teihehtsf ' iii " wlBcli"iw8 are 
interested. ; 
4. Produdtion Rules— these ruies define the ways in which sym- 
bols and strings may be legally combined to produce new 
strings. The nfemcflboqtcxt free" implies that the production 
rules are restricted to be of the form: 

«.-.-. ■■ •' ■ ■ •■ ' ' 

"nonterminal symbol"^ "string of symbols" < V 

where this rule implies that, for any occurrence, pf the symbol 
onrtiS; left, it may be replaced by the string on the right. 

: To illustrate how such a CFG allows us to recognize valid state- 
ments in a language, consider a grammar defining a simple arith- 
metic :.'i|signment statement language: V 

1. 'ftminal symbol set = (A,B,C,D, ... ,X,Y,Z} 

2. "Nonterminal symbol set == «V>, <E>, <A>} . \ , 

3. Start Symbol ^<A> . £r£0 ; 

" 4: -BrpduQtipA& - \ 

, ■ - : r (2) <E>-> (E);iE) |F) - V^"- 

,f (3) <V>^A|B|C| :; ..|X]Y^ - 

An additional notational convention has been introduced to short-,,, 
en thfe-Wrtog of, these rules/When two rules have the samd left-, 
hand side^mboVthe rules are .merged using, the symbol u |" to 
'i4 represent the alternate right-hanc} sid^ strings froift the oKginal 
Grilles. Thus, the nonterminal symbol <V> denotes (he set of strings 
consisting of ^upper-case single letters. The symbol may also be 
descriptive of the phrases it names, such as <Variables> or <V>. 
— If we begin ^ith the start symbol,; yye njay use. the pro^gtion 
f iules to substitute strings for : ;no^ to produce; 

v sirings called.se/ifeflria/ forms: : - ^^/MfP- f*' "4 ' < 

T%A> <V> -= <E> : — ... /-'.ts^iftntiaUoriAf^le Pl l] . 0 : 

■> => <E> [sehteti^ 

: ; -^/^*i<E>+'<E> • [se^^i^^l^^a] : ..A 
; ; ^P^ : <y> -f <E>^ [sen^fiaji^rm, rule P1.2b] \? 

'Ml$$t + <E> : .[sentential form, rule BUa] ..■■*?* 
• ' •^;':vX.^ <V> . ; ; [.sentential* form, rule P.^J V 
. \'''^,'Jsj^'. X==*A^KB -v.',' - [sentential form, rule PlSb]; J 
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sentence 

the language genera 

Since we have worked ^hfiW^h a derivation showing how*the 
string X = A + B may W -developed from the start symbol of our 
sample grammar, we see that this • of^triV\gs 
denoted by <A>, and hence it is. a sentence 
ated by this grammar. That is, we cbnclud^iliaf^ is a 

valid assignment statement ! Note, however^gt we still have hot 
said anything about what this sentence means^;^. : * 

A parse tree is an alternative method of showing the derivation 
that establishes a particular sentence in a language. This tree i$ 
constructed by Using a nonterminal symbol as a tree node, with the 
right-hand-side string symbolics (Wpendent nodes. At any stage in 
a derivation, the tree leaves nujjtfc up the sentential form. Corre- 
sponding to the previob^-xtefivatipn we have the following parse 
tree: ' ■ ■ ■ ■ ' ' '• ' — •' 



<A> 



Njoje that the derivation 



X 



<A> =f <V> = <E> 

<V> = <E> + <V> 
' => <V> = <V>.+ B 
,~v => X .= A + B 



T 



<V> = <E> + <E> 
<V>«<E> + B 
<V> = A + B 



applies the sarhe rules as before, but in a different drdfct'. ^ However, 
the/parse for -these alternative derivations are exactly the' 
same; hence the recognized phrases in the sentence also We exactly, 
-'the samer ■ 
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If we wish to add multiplication to the set of legal operations in 
%ur language, we might try the following modified production rule 

: . .r * *2: • (i) . <a>-» <v> = <e> v . 

(2) <E>-><E> + <E>|<E>*<E>|<V>. 

- o) <y> a i bi e i . . : i • v 

The foliowing two derivations, equally valid '.based on the above 
rule set, produce exactly the same final terminal string: 



(1) <A> .=> <V> = <E> 

. => <V> = <E> + <E> 

<V> = <E>* <E> * <E> 
... =>•*. X = A + B*C . ; 
(i.e., add A to the product of B and C) 

(2) <A> => <V> = <E> 

• => <V> = <E> * <E> 
" •• : => : <V>#*<E> + <E> * <E> 
=>* X = A + B*C 
' (i.&., multiply the sum of A and B by C) 

but with the two different parse»trees: 



(i) 



<v> 
x 



<A> 



(2) 



<A> 



<E>^ 

<f>' <f> 
A <Y> 



B 



I 



A V 



94 
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In general* If the samcVtcrminal string can b^roduccd by two 
different derivations that correspond to. two dilTcrcnt parse trees, 1 
^then it/ is not clear what symbols must be associated, with each 
nonterminal string set. This situation is termed ambiguous; and th6 
grammar that allows iti an ambiguous grammar The definition is 
intuitively satisfying since we saw that ip iKc last example, using* 
P2, it was^not clear which of the possible panes was intended ; 

(1) . .. . • X 5s.(A + (B * C)) . 

or * .... , . I • . ; 

(2) XM(A + B)*C) 

f! . Thus, the question of ambiguity centers around the uniqueness of 
the parse trees, not the number of possible derivatives. " 

As a generat principle, we want our. languages' to be unaitffiftgu- 
ous so that when we write* a particular. statement in the language 
wc know precisely the interpretation that the compiler willplace 
upon that statement. We saw an example of a context-free gram- 
mar that contained, both addition and multiplication operators in 
which the grammar was ambiguous. Can we resolve this ambiguity 
problem and still retain all- of the desired features of the CFG 
approach? The solution requires that we introduce the concept of 

operator precedency irf this case multiplication is an operation that 
should take precedence over addition. Consider the following re- 
vised set of production rules which include ^both operator prece- 
dence aqd the ability to use parentheses to force & desired order to 
a calculation: 9 it'.' ^ ..; 

. l * ■ ' • .* 

,; , P3: ' (1) <A>-><V> = <E>. 

• . , v . (2) <E>-><E> + <T>|<t5 

$& : 0-PV, <T>-><T>*<F>.|<F> 

; <F>-»«E»|<V> - 

<V>-?A|B|C| ... |X|Z. . 

The following pajse tree illustrates that, in this grammar, multipli- 
cation takes precedence, over addition. The phrases involving the 
multiplication, operator are forced by production rules P3.2% and 
P3.3a to be eKpresse^ 
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j Is. multiplication musj be '^formed before any addition jnayjbc 
undertaken. V •' . 



< A> =>'♦ X - A 4MB * C 

;<A> V 
<tf> - J> „. . <fe> 



v>, ■ I- ; <f> ' • 

". . • ;,*<?> "<f> ,;; Ax 



• •• ■ A B » .• ;, v ■ • 

4.2. Languoge^^^^tion/We have seen how to generate sen- 
tences; but "th^rSal problem we face is tha^f the input tp 1 the system 
is a given terminal string, and we must find put if there is a parse 
tree that will connect these. terminal symbols to the sta* symbol of 

/the grarhmar. One of the inost direct methods of establishing the j 
parse tree, if it exists, is to use the recursive descent algorithm, ^ 
.top-down method. That is, it begins with the start >ymbol of the* 

-•grammar as the roof bf the parse tree and'tries to build the tree ' 
down" tb connect with the symbols jrf the source string as leaves in 
the 'tree. It accomplishes this by repeatedly asking the question, 

. Can 'an instance offthe desired nPnterminal string class be found in 
the input string? :Y , V ' * \ 

.Unfortunately, direct application of this top-down method to the 
sample grammar produces an infinite recursion for the rule 
<E^<E> + <T>. Ih this case the recursive^escenf algorithm, 
would look for an instance of the string^lass <E> at the current 
-position in the input string by fiht trying to find an instance of <E]> 
in the inputtfstring. This is a disastrous situatipn for a computer 
program! The broblem arises because the rule in question is left 
recursive, a ^^M^^ whifchu n6nterminal is defined by , a symbol 

r string thatb^ ■ -V ■ 

There are ot^V^^rking algoritnms that, can accept left recursive 
- production rules directly. Hpweyer.let us modify pur sample gram- 
mar so that.it still generates the same language land is also in a 



l$$rm suitable for recursive descent parsing. The left recursionyin a 
\I^oduction rule of the form < A> 'r+ <A>x |y may be eliminated tjy 

converting the single rule into a pair of rules' via thp additic^n of a 

new nonterminal symbol; y v '* v 

•• \ ■ ;, <A> y y<A'> . / - \ "/.>:. ' r 

: ' ■ <A'>*->x<A'>inuii, ■'/,;/. ' 



\ - 4 



where "x" and "y" are arbitrary strings and "null" represenjs the 
emp^y string. A convenient iteration form that is equivalent to this ; 

pair^s\ ^ . ' "y v '; 7 : , ; •'•'.;>.. 

- •% ■ *' ■ 8 ;y <A >^ x * • / > \ ' •■ : 

where iht braces { } indicate thatr th^' inc may be k 

repeated zero or more. times. All tliree formulations produpe exact- 
ly the same\languag£, the*set;of stririjgs {y ? ;yxlyxx, ... .. }. 

P4 is a version of the sample grammar, as mbdified b^ removing 
left recursion^, that is 1 suitable for a recursive descent recognizer: * 

• ' (1) <A>-><V> = <E>. • . 

, (2), <E>-> <T>{^<T> } > * 

' ' V 0) <T>*-><F>{ / i<F>.} , 

■ s (4) '•<F>->«E»|<V> * ; v 

.•/ ; -v (5) <v>— aibici ... i viz ; 

Figure 6 contains amowchart that implements the recursive descent 
recognizer, for this version of the sample grammar. A trace of the 
operation of this prdgram is presented in Figure 7^ after code gen- 
eration capabilities have been added to the recognizer. The dia- 
gram, hbwever, does not contain/all of the necessary detail:^ is the 
procedure that checks to see if me current input symbol is a letter ; 
NEXT is a procedure tb move on to the next input symbol and to - 
store thai symbol in the variable IS ; °ERROR 'is a procedure that 
issues a message when art' e;ror has been found in the input string 
and/also terminates execution, since the parse tree cannot be con- 
structed. ; A ■ . ' ■' 

yA3*' Semantic Specification. The construction of the parse tree 
for%a statement is sufficient to show the validity of both the state- 
ment and its organization into phrases. However, parsing does not 
produce execufable^instruetiqns— we must provide a "meaning" for 
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each of the phrases to be found in the Iatfguage;For example, is the 
plus sign a unary operator or a binary operator? If it. is binary, 
does it refer to integer addition, floating point addition, matrix 
addition (integer or Yral), or {ferhaps tKe logical OR operation? ? 
Such a mjpaning mustjbd supplied, for every symbol used, in the N- 
language 'So that the compiler may 'generate the appropriatevcbde 
for the local context of the symbqiriV ot ationally/we could pwrnde 
an escape character which would allow us^to intersperse semantic 
commands v^ithin tije" productidn rules. Then, as a production rule «'.;. 
is applied during the parsing of the input string, the semantiS>com^ * 
mands Would indicate the semantic routine that should be appifera 0 
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' . 'Input Strtpfc ■ Pijograa Execution , " Parse String •' ' ' . Coda 

; 5c-A+B.. ■ ...Call A../;- ; [ '[.'., : ■ 

V V . •■■ "'■ Call V /■' ' V ( - ' " "; 

-•^A+B . return f roa V ' <V> W 

»a+b- • Call b : :■ //. • <v>(x;- 

':' ••: .' '.' •: ■ . ■ ■ call t / . . . */[■■■ ' 

. •! '"■ ca.ii f ' r \ ' 

;;: ; ■ .J.. :> ; IS-'CT. no . . 1 .' 

V.V' ' £all V .. f ' 

+B ' V ' "' j*-' * return fron'V <V>'(X)-<V>(A^ 

,/ : 1 return from F <V>(X)-<F>(A) 

•; ; • ' i' // . • ." /- is-* ** ? no •• ■ ■ b ,\ 

^ ■/ -V . return f roa T <V> (X)»<T><A) | 

v " ■ ; V ' isv+^ i jre. ; ■■ ■ '■■ ' : 

*» ' \ ., Call T i| I <V>(X)-<T>(A)+ 

' ' •• ' * ^r ; '^^"?'.'Call F.' 



o 7 ' 



7 ' 



, return f rota V <V> (X)-<T>(A)+<V>(B)- 

• return from .F <V>'(X2-<T>(A)4<F>(B) 



i 



v . . V Return from * ' , '*<V><X)*<T>(A)-H:T>(B) 

* IS^+'T no **, . <V>(X)-<T>(t 1 ) * > LDA A 

V • : ./ '■ •• .'- - ■ ADD B 

; V* ■ ' i r •;.•«• . ' * • * k * STA t, 

\ : ( return fyn K ^ ^ ' ^(X)-<^>(t 1 ) 

' ' ^ — ' , " return from A i ^ <A> . .■ * . LDA 

. • ' ' ' . ' STA X • 

v Fig. 7. Recursive descenjf trace and code prdductioif. ^ 

at that.'pdint in the code generation process. This is the technique 
Jfhat will be used in a later e^sCinpIe. An extension of thi& concept 
. offers evqn .more flexibility : provide; the capability of introducing ^ 
the settian^c routine code ^iractly into the production rules by? 
ailoVvlfig a complete block ot code, to be added 'to each rule. This - 
co/ie block then provides for^vfetever Cocle f generation actions are 
necessaryJThis^is the approach used jn the YACC [11] system, for 
^ exampje, to implement a number ^ jiigWy successful compilers. 

The following example illustjafes how a simple set of n\achine^- 
instructibns ftiayAbe produc&Wer the sample Arithmetic assignment 
language. T^e development fs based on a variation of the methods, 
d^pbed Graham [13]. , , / \ ' - 

^Fh;st, let-ds assume thatjhe target computer cpdg, /similar to that 
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discussed in Section 2.2* consists of ; the instructions listed there 
plus: ; ' .; , . ■ , r ,\. , } t ... J,., 

^ MPY X == Multiply the contents of the accumulator by the con-' 

/ . tents of memory location X and leave the product^ 

in the accumulator, ;."■/ ' t 

s * .• ■''»' * 

We also nfced to extend the production rules in two ways : • 

1, Add the escape, character V to flag semantic commands. 
These will have the form u .n" to indicate the execution of the 
nth semantic routine. r > 

2. Associate each phrase name in the parse tree with a symbolic 
address that will eventually corresponds to the location in 

. memory that contains the value of the phrase. Thus % <V> 'may 
be associated with X, and <E> associated with T3, the third 
Vgenerated^ temporary symbolic address./The ' generated ack 
& dresses will be needed to hold values during the evaluation of. 
an expression. ■/. ' ^ V 

*Thc final grammar for our example language including the sem- 
antic calls: 

- \. - P5: (1) . <A>-;<V>=:<E>.1 ,/ 
* M (2) <E>-<T>{ + <T>.2} 7 y 

' ■ " '(3) . <T>~v<F>{*<F>.3} 
' V > • (4) <F>-«E»|<V> / ■ 

(5) <v>->a|b|C|:..|z 

\yhere tfre indicated semantic routines must produce the* actions : . 

.' - * ■ -.- •• ----- - _ - ----- ■ ■ . > 

.1: Generate the^ code to store the value of the right hand- ex- 
pression in the-fefr-Side variable. ' V * 
.2: Generate the code-to add the tWo terms together and to store 

the result in a new temporary location. 8 
.3: Generate4he code to multiply the two factors together and to 
* ■ store the result in a new temporary location. ^ 

In all cases the last symbolic address used or. generated on the right 
side of the production rule should -be associated with ^the phrase 
name on the left side. Figure 7 Semor^trates this process by^llus- k 
trating the step-by-step procedure for 1, the development of a parse 
tree and its .associated generated code. c * 



4.4. Code Pptimjzation. The cpde generation example of the 
. previous ^cotion 1 suffers lYoirh' two ftiajqr deficibncick-in oHhc 
required performance of real cqmpilcrs^FirSt^ it assumes a compu- / 
fational v mac^iji? that is far simpler than moaft current computers.;. 
; ; The .semantic fo^utihes; Wist be augmented significantly to account '< 
for, hems sjich/as multiple accumulators and different modes of 
addressing. Second, the generated code contains a number of in- 
')' striictSqns that' are actually unnecessary. The redundant instfuc- 
tiops were^cre^ted because of Jhe treatment of each phrase as an 
independent entity irrespective v of the context in which it appeals.;.;' 
This latter situation may be improved by applying the techt^iqucisv 
ef code optimization to the co,de as produced by the semaiitic 
^rdatines. We ijind, ljowover, that when we look at the subject of 
!|fcode optimization, it immediately splits into two distinctly different 

1 1. Local Code Optinfiiatign— improving the code efficiency by 
^* working withih^ the jmmediate local Context of the particular 
phrase being parsed, essentially within the confines* of a single 
statement. 3 • "V .' ■ " t v .v '' 

2* Global Code Optimization — imprcnrtrigTlie code efficiency, by 
0 looking at the interaction between statements For example 
by not repeating the evaluation of a duplicated expression orS 
by factoring loop-independtent expressions Initsideithe loops 
in which they appear. ^ a I ' , T 

The code as generated for the example in, Figure 7 was: 

- * "* lda h J--'- ■ 

ADD B, 

• ' C ..; . ST A TL v 

V- LDA Tl ^ 

. STA x v^;..;.\ ■■. 

The value of the sum is available in the accum^tor when the-time 
comes jtp store that value in X. Conse<Wntly, the *LDA T17 in- 
struction is clearly unnecessary. Is the /SJ A Tl' r mstruction also 
not needed? If the original source langj/age prefgram never used the 
value again, then "there would be r>£ advantage ' to saving it. The. 
reason for storing any value is tp-^ve it for later use, thus making 
it unnecessary^ reevaluate the jsame expfessiop. Con^der'how the 
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;•. * ;" ; , .... . ' i"-. *i 7 1 j j ■ \ 1 1 '•' 

following program figment could beAiandled by; ;an optimizing 
-compiler: •;• JyJ) :-»-.■.•:-.- :V.i i: ,:;j . 

Now ft the' tempbrary gavifijg of fi£e suijrt could easily produce the 
<?ode V • I ■ W ' 




with d cptresp(>irtlingly significant improvement )n the object code 
> efficiency.- ; '»/ ■'- 

The theory of fprmAl language recognition 'has been studied for , 
some time, and it is nbw possible to pro'du3fe software tools that 
provide; most of the' d&tail work necessary to develop compilers for 
new languages. The -problem of'code generation is not nearly as 
clean aiid well defined! Ft is highly dependent on Jthe machine and 
language, and hence has been solved by using many heuristic and 
>ad hoc metlhods. Graham [13] describes many of the local xode 
optimizatioa techniques used in the? early compilers. Lbwry and 
' Mecflock [14] describe the globai code optimization rtie&ods used . 
in tlie .iBNt FORTRAN H compiler, a program that produces a 
: very ^efficient object code,., Aho and Ullman [1(1] describe more 
recent developments, including some of the efforts being made to 
"systematize the fkild. • 

v : ' ^ * • ■': ' ' ' ,<. ' s ■ . ' • , : 

• 5. PROBLEM-ORIENTED HIGFJ^LEVEL LANGUAGES . 

■ ..' • ■ \ * * ■ - '• y ,' r : * 

^ While general^purpose high-level languages Y^re being' devel- 
oped; another paralldP stream ipf language development was taking 
place. Computer users with specific requirements tended to create 
'lajiguagds* that were more' oriented toward their particular prob- 
lems. Some of the characteristics of these problem-oriented 
languages are * sufficiently diffenint from the general-purpose 
procedure-oriented languages thaft we shall discuss a representative 
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%{' ■ ' " '* • " ' 

sample of thejm in a number of separate application specific Cat- 
CgOriCS. , \ . • . ' ■ - ■ . , : 

A 1. Report Gcncraiori. RPG (Report Program Generator) [15] 
is a'spccific^langUage jthat, is' widely used oh small computer-sys- 
tcr#s.*^Ve shall use ft* as "an example of the general -class of 

>*languages s devoted to writing reports. The basic assumption in a 
repori generator' system is that a series of records will be read 
sequentially frbrii one or more input files'; some information will be 
extracted from L fleeted set of these records, and then one or more 

/ .output files, ak>h&with a final printed summary, will be produced. 

^s a result, tHe overalLJogifc of the progrsflh inbuilt into the system 
and' the pro^rammetv does not need to specify the exact sequence of 
steps; rathefr, he indicates what operations are' tp be done. IpVact, 
one; pf th£ main concerns in the design of procedure-oriented 
languages (the definition of control structures for flow ofc'biitrol, 
specification) is genefthly of no concern in a report generatioh 
language This typi of information is completely defiried within the 

... languiige compiler; */ \ * T " ■ „ 

To simplify the^durce program preparation further, a fixed 
format of some ty^e usually is specified for the source language. 
This may be se£n most easily by studying a few of the common 
. RPG statement types: . c, \ / y < -\ 

1 1. Fjte Description Specification— Declares- the names and 
^ x^hysical descriptions oTall files that'are used in the program. - 
fV2. Input Specification— Declares thfe names and attributes of all 
items of information that are read from the input'flles. 

3. Calculation Specification— Describes the calculations to be 
applied to the input data. ^ ' 

4. Output-Format Specification— Describes, all output infor- 
mation to be feroduqed, including formats and file^des- 
• ignatiohs. . ' -V V \ 

The information fields on. a file description specification include: 

coliimn 6: *F, to indicate statement type. 
* "\ 7-14: filename. r c , % 

t 15: indication if input, output, or update file. • 
* 24-27: number of characters in ,e^cft record. 

^0-4i6:'i>hysifcal device holding file (tape, disk, etc.). 
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A specification ,f or a file of input punched cards wpuld be: 

F INPUT ^ I ' 80 ..- READ01 . _ 

( Tfiis statement describes the'File named INPUT as an Input file 
consisting* of 80 character records (one* 80-column punched gard) 
on the card reader READ01. , * 

The, information fields of an input specification statement in-^ 
elude: ■ k . 



column 6: T, to indicate statement type. 
** '744: file name. „ v ; \ ' . 

19-20: .record identifying indicator. i 
i^iv>?* s * ' y - . . 24-41 : record identificatipn codes. 

44-51: field location, \ 0 
' / ; ■ ■ "* 53-58: field name, V „ ■ 

This/ statement introduces a fundamental data type in an RPG 
j program^ the indicator. Each indicator is identified by a two-digit 
code (01 -98). and represenfs a two-positioh switch (on/off). The in- 
dicators are normally turned off before an inf)utjrecord is read. The 
record identification code, detailed in columns 21-41, will turn on 
the appropriate indicator if the* data in the record meets the speci- 
fied condition (such as "therd must be a 'D' in position 10 of thp 
input record"). If the conditions are satisfied, the variables listed in 
the name fields also will be assigned values from the positions 
specified by the location fields. The lines: 

v HNPUT 4 14 MO CD 11 30 NAME 
I 40 60 ADDR 

will turn 'on indicator 14 if the contents of position 10 is the 
Character 'D\ and also will store p*6*sitions 11-30 in the variable' 
NAME and positions 40-60 in the variable ADDR. The output 
. specification-' is similar, but with the added complication of exten- 
sive editing and formating capabilities. 
J The calculation specification includes: 

column' 6: 4 C, Jo indicate statement type. 

7-17: Indicator logical expression. * 1 

18-27V Factor 1. - 
28-32: Operation. > , 0 " 

33-42: Factor 2. 

43-48 :~Rfesult field, \ . 
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Thc^indicator logical expression evaluates to "yes" or "no" b^scd t 
on the settings of the indicators (such as "if indicator 20 ijkon' and 
indicator 22 is 4 off, s then 4 ycs/ otherwise W"). If the answer is 
4 no," the balance of the statement is Ignored. If "yes ".the statement 



\ is c\ccutcd. For example, 



" • ' " " RATE MULT pVERTM SAVE" \ 
ill multiply the two factors RATE and OVERTM and store the 
result in SAVE. 

The execution of an RPG program consists of reading an infiut 
record, setting the indicators, doing what is desired based on the* 
indicators, and then repeating this cyclc'until the input data have 
been exhausted. The simplicity and directness of this approach has 
been so successful that the language has expanded far beyond what 
this short introduction can describe. ' 

5.2 Database Query Language. The ability of computer systems 
to store largc\amoUnts of information has affected all of us in many 
Ways. However, just storing information is not sufficient to make it^ 
useful— we mus^ also be able to locate 4t again when we need 'it. It 
is tlje function o£a database query language to allow a user that is, 
not a computer specialist to access and manipulate the information 
stored within a daltabasc system. As a representative of this class we 
shall look briefly arSEQUEL 2 [16]. Wiederhold [17] contains an 
extensive list of references to additional systems of this type, as well, 
as a detailed discussion of the data structures required to support 
such a language system\ ' 

SEQUEL 2 assumes that the database has a relational [18] form. 
That is, information is\tored in files that have a tabular or- 
ganization: 
EMPLOYEE 

Name- J deptno salary jobtitle 



John Jones : -40 \ 15000 Programmer s 

i » Shirley Srftith : 20 V 16000 Mathematician 

The namq of this relation is EMPLOYEE, with the column names 
(or attributes) NAME, DEPTNO, SALARY, and JOBTITLE. This 
descriptive information about the relation may be summarized asf 

EMPLOYEE(NAME,DEPTNO.SALARY,JOBTITLE). 
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Each line in the tabic is called a tuple. A tuple represents- one 
instance of thc'rclation, with each tuplq assumed to be unique^ The 
tuples are unordered in the relation. Normally a database would 
contain many such relations in order to express all of the desired 
information. , 

The basic operation of th^SEQUEL 2 language is a mapping, in 
which a knovyn quantity is transformed into a desired quantity by 
means of a given delation. The structure of this mapping statement 
- is ; • , , 

SELECT <list of attributes to be returned) 

FROM (name' of relation to use) . * * 

WH&RE (predicate .determine what tuples tb seject) 

Thus to find the personnel list of department 20 wc could use the 
query: " 

SELECT NAME FROM EMPLOYEE WHERE 
DEPTNO = '20\ 

> Wc could obtain a list of all mathematicians making more than 
$1 5,000 by using the query: 

SELECT NAME FROM 

EMPLOYEE WHERE SALARY > M5000* 
• \ AND JOBTITLE = 'Mathematician'. 

A relation may *b<r viewed as a set of, tuples. Since a mapping 
returns a desired set of values, a mapping may also be considered 
as producing a new relation as its output. Thus the simple 1 
SEQUEL 2 mapping operation may be easily extended by nesting 
mappings and by using set operations such as unipn and intersec- 
tion. To illustrate these extensions,* assume that our database also 
contains the relation: 

DEPT(DEP v TNO,LOCATlbN,MANAGER) 
• % 
We may find all of the people that work in St. Louis by using the 

query.: *i ^ » 

SELECT NAME FROM m 

EMPLOYEE WHERE DfePTNO = 
SELECT DEPTNO FROM.DEPT 

WHERE LOCATION = 'St. Louis'. 

: , m ' 
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What if we were interested in the names of all of the department* 
managers that had salaries greater than $20,000? This information 
could be obtained by llnding^both the set of all managers and the 
set of all employees with the desired salary, and then determining 
the intersection of these two sets: „ . . . 

SELECT NAME FROM EMPLOYEE 
WHERE SALARY > '20000' ' 
r INTERSECT 

SELECT MANAGER FROM -DEPT. 1 . 

Since the, hist mapping docs hot contain a WHERE clause, every 
tuple from the DEPT relation will be selected, thus producing a list 
of all of the managers. 

There arc many additional features in the language ranging from 
simple query facilities easily learned by nohspCcialists to more com- 
plex facilities intended for professional programmers. Pfelnctions 
such as AVG, SUM, COUNT, MAX? and MIN arc provided for 
application ,t6 the set of values found by the SELECT statement* A 
relation may be partitioned into separate tuple sets by the GROUP 
command or sorted by the ORDERifcommand. 

Relations may be created, modified, or destroyed. The user who 
creates a relation is fully and solely authorized to perform actions 
upon it, .but may" grant access rights to the relation for other users; 
These access rights include, among others, the ability to read, 
insert, delete, arid update tuples in the relation; This'facility tfllovys 
the owners of the information total control (hopefully!) over, who 
^may have access to it — a critically important fcaturfe forlany data- 
base system. . < 

. S3. Graphic Languages. .The abHity to sit at a computer terminal 
that contaipsia graphics display tube has created a new world of 
possibilities fdr the mathematical modeling of curves, surfaces, and 
nfigures. LG- [20] is a language that allows Ihc definition of gco- 
Imetric objects and elements, computes their parameters, and dis- 
plays the results. It was designed specifically for use by non- 
programmers, being easy to learn and vcry^ close to the natural 
language used in geomfetry. 

LG is a conversational language, providing for a user dialogue 
with the t graphic terminal by means of a command-answer type of 

107 . . ' • 
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processing. A command is given by the user, and the system re- 
spondji with two forms: the computed standard-parameters of the 
desired geometric clement and a display of the element itself on the, 
screen.' , + * 

The LG Instruction line that establishes what the user wants 
' done consists of thrc£ parts separated by delimiters: 

* <command>; <name>: <dcfinitioa>,* 

* • * 

4." <command>— ValifJ commands arc either built in by being 
defined at system generation .time or added on-line by the 
user. Marty of the standard types of gbemctric items (POINT, 
LINE, SPHERE, etc.) would normally be built in, as well as a 
number of standard functions to be performed (LOCUS, 
COMPUTE 4 , MACRO, etc), The user may add hcW com- 
' mands at any time by using the MACRO capability. 

2. <pamc>— The name to be assigned to the newly specified gco- 
pnetric clement. » ^ 

3. <dcfinition>^Spcdfics a list of data elements, separated by 
delimiters, which defines the geometric clement. Each element^ 
ii) 1nc list is cither the name of a (ircyiously defined geometric 

■ 1 clement or a keyword set equal to. an arithmetic expression. 1 " 

Sonic sample instruction lines: 

POINT; Pi: X - COS(3,5), Y - L, Z - 2 ♦ ALPHA 

The point named Pi is defined by giving it^X, Y, Z coordinates. X, 
Y, Z arc thc t keywords- indicating which coordinates arc set by the 
respective arithmetic expressions. These expressions may contain 
scalar elements, FORTftAti operators and functions. PI is called a 
fixed element because it has been completely specified, 

* LINfe; L: PI, P2 

The line L Is "defined by specifying two previously defined points. 

COMPUTE; PI: = 4 ♦ ATAN(LO) * ' 
*> ■ • • ' * 

"This command represents essentially a ^standar^ assignment state- 
ment. , 

*■ POINT; - A: L, X = 3 

106- 
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The point £ has been defined as the intersection of the line L and 
the plane x = 3. ; . . x < . . 

f SPHERE; S: R = 1, X = 0, Y = 0, Z = 0: 

the definition of a sphere of unit radius centered at the origin. AH of ;ji 
the examples above represent! fixed efements and henpe ywill *be ^ 
>displaye(T automatically. • .* . * ■ 

Consider the following sequence of commands: . > ; ' . . ' 

PARAMETER^ T : = 0 \ • 
POINT PI: X =*T, Y = 0, Z = 0 V' 
POINT ■■ ; P2: X == 0, Y = T,Z =: 5 V A- 
LINE Ij : P1; »P2 

LOCUS ; L : T, MINH - 10, MAX = 10, STEP = 0.5' 

The first line creates the scalar variable T as a parameter and 
initializes, its value. PI, P2, and L are called variable elements since 
they contain a parameter in their definition. Variable elements will 
be displayed only "on specific instructions, such as the last line 
containing the LOCUS command. The display will be a hyperbolic 
paraboloid composed of the line L drawn for the values of T equal 
to -10, -9.5, -9.0, ... , + 10. The use of 

- TRACE; L 

would display L for the current value of any parameters. A 
parameter value rfiay be changed at any time by the^ommand: 

...A ' ' ; ' REPLACE; T: = 4;.- . ^ . 

The current implementation contains approximately 80 com- 
mand?. There are also many additional ways to define geometric 
elements,, including models such as intersections, transformations, 
arid .alternative coordinate systems. The language has been used in 
studying variations in complex geometric figures, problems in kin- 
ematics arid mechanics, and for some applications in optics and 
tocography. . , 

6. CONCLUSIONS 

Sammet [21] lists a total of 166 programming languages that are 
in use today, 76 in the categories of 'numerical scientific, business 
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data processing, list processing, string processing, formula mampu- 
lation, and multipurpose. More^pecialized application areas (such 
as accdufiting, circuit design, editing and publishing, simulation, 
etc.) cover the remaining 90 langyages.^ 

There are some interesting and contradictory trends taking place 
that help to fl expiain the proliferation of languages. First, it should 
be nbted. that only a small number of the available languages are fn 
widespread use. There is an important economic motivation for 
this: We do not: want to reprogram a problem every time we want 
to run it bri a different machine. If we use a high-level language 
that is machine independent, then we may run the program on any 
computer that has a compiler for that language; Thus, this 
portability, issue argues strongly for a few powerful; standardized 
languages that are implement ed ' on as many different computers as 
possible. Since programs that are widely distributed are frequently 
production-type tas^s (everybody runs a payroll program regular- 
ly!), it 'is important that the Compilers produce efficient •object code. 
This, in turn, means that the compilers themselves are large expens- 
ive programs; and hence tend to be restricted, in numbers. We now 
have standard COBOL, standard FORTRAN, and a few other 
languages either already standardized or being studied in prep- 
aration for standardization, v - 

On the other hand, the software tools now exist that make it 
possible to desig^and implement a language* quickly. The price 
paid for a fast implementation may be slow execution, as compared 
to what could be achieved with more effort spent on code opti- 
mization for the particular hardware in use" However, consider that 
in a research environment, say where language constructs are being 
studied, code efficiency may not be really important. Also, as hard- 
ware costs decrease and programmer, costs increase,^ it becomes 
more important to concentrate on human efficiency (i.e., language 
and system design) than on hardware efficiency. This is particularly 
true for programs that are to be executed just a few times, since 
niost of the costs will be concentrated in program preparation and 
testing. -, 

It looks as if both trends will continue in full force: A few very 
general languages will be fixed, by standardisation to achieve a 
maximum degree of portability and efficiency. Tfifera also will be a 
continuing stream of new languages, f>oth general purpose and 
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problem oriented, as we attempt to make it as- convenient and as 
easy as possible for everyone to use a computer. B 
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SPECIFYING FORMAL LANGUAGES 
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1. INTRODUCTION - 

\What is formal language theory? Theje^are several'themes that 
provide partial answers to this question and that together provide 
a first approximation to the answer. 

(1) Mathematical models for. natural language provide a formal 
basis for studying the nature of languages such' as English and 
French. It is important to have such models if linguists are to 
understand the common features of such languages and their gen- 
erative structures, and if automatic translation between such 
languages is to be achieved [7J, [19]-[21], [23], [60]. 

(2) The essential features of the syntax of many programming 
languages can be faithfully modeled by means of context-free gram- 
mars, the same structures that have been used to describe the 
syntax of natural languages. By using context-free grammars to 
describe the syntax of a programming language, tools for trans- 
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lating between soucce programs and machineVcode have been de-^ 
ve!oped ; and 4 the automatic generation of c^tain portions of com- 
pilers has been facilitated PT, [18], [513, (ftj, [94],;[95]. r - 
^(3) Abstract automata provide mathematical models for. certain 
aspects Of the. pfocess^of computation e.g., there are Turing- ma- * 
chines, devices which capture- the 'notion of computation 3s 'de- 
scribed by. logicians, and- there are. jinite-state machines which- 
model the behavior of swifching cirj^its. Other devices, such as 
pushdown store transducers, ardf used to model aspects of the ^com- 
piling process. Formal language theory provides tool^for "describr 
ing preci^y the power and the limitations of such models and thus 
is extremely useful* in relating studiesin computability theory, 
automata theory, and computational Complexity [2], [28], [37], 
[41], [58], [67M81], [82]. 

(4) Formal language theory as an abstraction of many aspects of 
real computation is a new brafrcli of mathematics relate^ to logic 
and to combinatorial algebra [29], [96]. 

In this paper, language theory is presented from the standpoint 
of (3) with the understanding that (1) has provided much initial 
motivation, that (2) has provided motivation and applications, and 
that (4) has not yet„ fully emerged. It is the use of formal language 
theory as a tool for investigating a, wide variety of topics within 
theoretical computer science that has brought forth the rich family 
of generative and automata-theoretic structures and the. many 
properties of languages and classes of languages that have given 
structure to the theory. Thus the classical connections between 
generative structures, automata, and properties 'of languages and 
classes of languages gre surveyed here with the intent of exemp- 
lifying for, the nonspebialist the questions asked in the study of 
formal language theory. 

A language is a set of strings of symbols where the symbols are 
{aken from some finite set, the alphabet; A language may have 
structure becbuse its strings pre related by some common property, 
e.g., {w e {a, b}* | the number of a's occurring in w is equaTto the 
numbef of b's occurring in w} or (we {a, b}*| the number of a's 
occurring in w is equal to the number of b's occurring in w and fSv 
any pfefix y of w, the number of a's in y is greater than or equal td 
the number of b's in y}\ A language may be finite or infinite. In the 
case that a language is infig&e we encounter the notion underlying 
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-muon 6f the theory: ToMi/fctfss an infinite language in\erms*ofy 
Effective computation, the language must have tome finite repy 
resentation; aod this^repr^sentadon fim$t give information about 
t^e language thatjvill allow one to construct effective procedures 
for generation or acceptance. / * ' ■ " - M " 

There are tl^ree methdids that' have • 6eeri ^ged extensively for 
< Tmitely tT Specifying,language$. 

(i) Define a finite structure v that generates the strings of the 
language. Thus onQ must spfcify an effective 'procedure that enu- 

-merates the language and nothing outside of the language. The 
most familiar example of such a system is a phrase-structure gram- 
mar and the notion of derivation in such a grammar. Generative 
structures are similar to formal systems a$ studied by logicians! j 

(ii) Define a finite structure that recognizes or accepts the words 
of the language.. Examples* here include: finite-state acceptors, 
Turing machines, pushdown store acceptors, etc. Generally, the 
device examines an input string and computes the characteristic 
function of the language, giving an answer "yes" or "no" to the' 
question "Is w in L?" ' . ' 

(iii) .Explicitly specify a language by definirig it from given 
languages and certain operations on languages. . 

The class of context-free languages lies at the core of the theory 
of formal languages, and so it is useful to. explain just how context- 
free languages can be" specified. In keeping with a language is 
context-free if and only if it is generated by a context-free grammar. 
(These notions are developed in/Section 3.) In keeping with (ii), a 
language is context-free if and only if it \s accepted by a nondeter- 
ministic pushdown store' acceptor. (These notions are developed, in 
Section 4.) In keeping with (iii), a language is context-free if and 
only if it can be obtained from the Dyck set on two letters by a ; 
finite number of applications oNhe following operations: intersec- 
tion with regular sets, inverse homomorphism, and homompr- 
phism. (These notions' are developed in Section 3.4, particularly 
Theorem 3.14.) » ^ 

Just as individual languages are specified by grammars, auto- 
mata, or algebraic characterizations, classes of languages may be 
specified by considering all languages generated by a specific class 
t)f grammars or all languages accepted by a -specific 'class of 
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automata or all languages defined algebraically from a certain basis' 
class oflaftguagep * ■• ' - * '• ■ ^ v 

' Jn this papers the class of context-free languages is' used* jto 
^exfcmplif/lhemesMhat arise in formal language theory. Extensions 
and" restnctions of this class are described and the study of 'com-' 
plexity classes is introduced However, this, overview is notijan 
' e?i|(jaustive* survey. The topics mentioned here are central to the 
. StuHy of formal language theory, but the topics* omitted are^tpcj 
numerous to list in this brje£ papeuA; . \< i H . tfe ! * 

The practical role of context-free languages in compiling, a pro- 
gram (written in a context-fsee language) is discussed in, the pr;ecbd- 
ing article by Ball. ' .^ ff\- 

The primary references given here Ave, included because\gftheir 
historical interest, because? they, havl piayed A.key^ role in th&devel-* 
opment of the subject, because the material included isj^hot de- 
scribed in ahy secondary source, or because they give the 'flavor of 
cu$fent work. Many secondary references are included so that the 
•reader has a chance to learn sdme of the 'basic matepal ^bdbre 
plunging into research papers. „ .'■ . 

Finally, it sjiould be noted that the^ popularity ' of fofarnal 
langyage theory has fluctuated greatly. It is my contention th&t 
many of the themes in formal language theory are central to theor- 
etical coj^puter science and ttiat despite these fluctuations formal 
language theory will continue as a lively and'vitai area of research. 

2. PRELIMINARIES * '■' V 

In this section we establish natation and state some results on 
two fundamental classes of languages, the regular sets and the re- 
cursively enumerable sets. 

2.1. We assume as an undefined'primitive the notion of a symbol 
A string (word, sentence) is a finite sequence oj" symbols and the 
empty word e is the empty sequence. The length of a string w is the 
number of symbols \n w; it is denoted by=| w | . Thus | e | = 0 and, if 
w is a string and a is a symbol, then |a| = 1 and |w&| = |w| +. |a|. 
The basic operation on strirtgs^sed here is concatenation*, the 
concatenation xy of strings x and y is the result of juxtaposing x 
and y; and so | xy | = | x | -f- 
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A finite set ot^M^q^ is ?n alphabet {vocabul&y). If Z is an 
alphabet, then; Z^jplie sei of all strings over Z, S* = {aj • • • aj, 
?ri > 1, each a ■{*}., A^y subset of Z* is a language over Z; 

/ . Thus, to sayit^'t L is a language is to say tha,t there is some 



We.will^ on languages. First,* there 

are ths^jMolean operatidns; union, intersection, and com- 



ft plementlifenV Second, there are' the. Kleehfe operations: concat- 
\ eriatton^uhe concatenation of languages J^i and L 2 is; LiL 2 = 
■(v/MWi e L x . and w 2 e L 2 }), v Kleene * (L* = (j£ 0 L ! where 
L 1 = W and v L n+1 = LL n ), and -Kleene + (L + = 

X^W^ is an alphabet, thenjZ* is the free semigroup With identity 
(00§emonoid) generated by 't. Here the .semigroup multiplication is 
^^icatenation ;of strings, an associative binary ppefation. The 
m&hpty word e is the\identi|y. Since eyery w e has a unique 
fflfactorizatipn as a finite sequence of symbols from Z, Z*<is the free 
semigroup with identity generated by Z. For a language Lg l* 
. lL* is the subsemigroup (of Z*) with-identity e generated by IT. 
^ Another operation on languages is reversal If Z is an alphabet 
§ah(i w g'Z*, the reversal w R of w is defined as follows: w R ,= w if 
v'w e Z u {e} ; w R = ay R if w = ya for y e Z*, a e 1. If L £ Z*, fherf 
i the reversal of L is L R = {w R | w e L}. 

, An important consideration in formal language theory is the 
study of several types of mappings between languages. The prota- 
ctype of 'the mappings considered in this paper is the notion of 
homomorphism. For alphabets Z and A, a homomorphism h from 
Z* to A* is a function h : Z* — ► A* with the property that for all 
x, y e Z*, h(xy) = h(x)h(y). Since each w 6 .1* has a unique factori- 
zation as a finite sequence of symbols from Z, a homomorphism is 
uniquely determined by defining its values on the symbols in Z. 
Thus we are considering homomorphisjns between free semigroups. 

A homomorphism h : Z*~> A* is nonerasing if for every w e Z*,' 
h(w) = e implies w = e (equivalently, h(a) # e for every^a g Z) and 
is length-preserving it |h(w)| = |w| for every w e Z* (equivalently, 
v jh(a)| =1 for every a e Z). Thus a symbol is "erased" if it is 
mapped to the empty word. Clearly one can view homomorphisms 
as functions taking strings to strings with no thougfit of the simple 
algebraic structure involved. 

' 7 '■ 116 
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, If h : Z*-^*, is* a homomorphism and Lg £*, tfe write 
h(h)^qv '{h(w)Jwe L}. fIf 4 L<=A*. we vyrite h'%) ftir 
{w e 2*| h(w) We refer to.tbe mapping h^ 1 from subsets of 

i A* g to subsets of Z* as.-an inverse homomorphism. ' *1 
' : A hqmo'morphiW substitutes a string for a 'symbol. It is of con- 

- sider^ble interest # in formal Jaoguage theory to- substitute a 
language, for a symbol. J© define thfs notion consider an alphabet 
,Z^ For, each a e let Z a be an alphabet and letT(a)feZ*. Let 
T(e) - {e} and T(wg) = T(w)T(a)7or each a f Z, w e-Z*. Then T is 
a .substitution on Z. If '<£ is a class of languages sfich that for each 
a e Z, T(a) e'^wthen T'is an ^-substitution. A class j£? x " is closed 

\under & 2 >subsmution if fof every L e JS? lf when Z is an alpha- 
bet siich that L c I* and T is 'an .S? 2 -substitution on Z, then 

One -of the important themes in the study of formal languages 
and abstract automata- is the characterizati^ ;pf classes - of 
languages and automata by means of operations under Which the 
class of languages is closed. Recall that if 8 is aanrary operation on 
^languages, then a class .S? of languages is closed under operation 
9 if for every choice L ; , L n oFl^jigu&ges in,&,'6(L u t n ) is 
in <£. '•' ■ 

2.2, Now we turn to the notion of "regular sets," the languages 
recognized by finiterstate machines operating as acceptors. The 
study of finite-state machines as devices to compute functions arises 
in the theory of switching circuits and the study of. logical circuit 
design. The restriction of this study a to machines that compute only 
characteristic functions is one of the most important areas of the 
theory of abstract automata. 

A deterministic finite-state acceptor D = (K, Z, <5, q 0 , F) has a 
finite set Kj>i_states^an input alphabet Z, a transition function 
5 : K x £-+ K, £n initial- st^ q 0 e K, and a set F of accepting 
states, Fg K. The transition function is extended to 
<5* : K x Z * -> K by defining <5*(q,e) = q and— * .. 

5*(q, ya) = ^*(q. y). a) 

for every q e K, a e Z, y e Z*, and since <5* agrees with 5 on K x Z 
the notation is abused by referring to 5 on K x Z* instead of <5*. 
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The language accepted by D is - 

V* " L(D) = {weI*|i5(q 0 ,w)eF}. > f .* 

A language L is a regular seU(regulur language) if there is some 
deterministic finite-state acceptor D such that L(D) = L. ~ 
^Intuitively,' a finite-state acceptor is a machine which Kas an 
j^put Jape (see Figure 1). On the input tape there is a read-only 
j'heaffwhich moves across, the taj^ from left to rj$ht, reading the 
./com^nts of successive tape squar^Jhe set of starts and the transi- 
ticln function represent the "logic" of the machine and may be 
viewed as a program with no variables other than a single input 
.' variable which takes values read by 1 the input head. 

These definitions describe a finite-state acceptor as an extremely , 
simple model of a computer.- The regular sets so described may 
represent sets of numbers -or logical ^predicates or sets of strings 
with a^hiple linguistic or syntactic pattern. In compiler construc- 
tion one constructs finite-state acceptors to recognize portions of 
the input as part of the lexical analysis! Notice tftat the construc- 
tion of such a device requires only a bounded amount of memory. 
k While Jinite-state acceptors are quite Simple, the class of regular 
'sets is rich in structure. *Th is Is shown in part by the following 
result. V 

Theorem 2.1. The class^f regular setk is closed under the Bool- 
ean operations, the Kleene operations, sutffctitutten, rever^il, homo- 
morphism, and inverse homomorphism. 
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Fig. 1. A finite-state acceptor. 
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Let us sketch the constructions useddniproving-that^th^cl^ss of 
regular sets is closed under the Boolean operations. First, -recall 
that the only available method of specifying a regular -set is by 
specifying a finite-state acceptor \vhich acceptTall atjd only mem- 
bers of that set. For i = 1, 2, let ' % + 

■M,-(K lr Z > i|";q |i -F l ) 

be a finite-state accep.tor. Define ' 

I \ S: (K t x K 2 ) x I^K l x K 2 

by ' . * . . 

<5((Pi> p 2 )> a)=(5 1 (p 1 , a), <5^p 2 , a)) for p x e K lf p 2 e K 2 , ael. 

Let \ 

F 3 -(Ki xF 2 )u(F x xK 2 ) . . < 

and let 

. ( ■ F 4 = FixF 2 . : - ; * . \ 

Then \ V * . ♦« ; 

M 3 -(K 1 xK 2 ,I,<5,(q 1 ,q 2 ), F 3 ) 
is a finite-state acceptor such that ^ 
L(M 3 ) = L(Ma.J"L(M 2 ), 7- 

and 

M 4 « (K x x K 2 , I, <5, (q lf q 2 ), F 4 ) * 

■ • ■ . / * - ; 

is.a finite-state acceptor such that / 

/ 

-LtMsHUMOnLCM*). 
Further, if F 5 = Kj - F lf then J 

M 5 -(K 1 ,I,(5 1 ,q 1 ,/F5) 
is a finite-state acceptor such that / 
■K L(M 5 ) = E*-L(Mi). 

If M .= (K, L, 5, q 0 , F) is a given finite-state acceptor and vv e 2* 
is a given input string (see Figure 2), then we can determine whe- 
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Fig. 2. A finite-state acceptor with initial state q 0 and one accepting state, q p such 
that an input string w in {0, 1, 2}* is accepted if and only if the sum of the digits in 
w is 1 (mod 3). . 

ther w is in L(M) by computing <5(q 0 , wj and inspecting F tp 
determine whether (5(q 0 . w) is in F. Further, it is clear that if L(M) 
is not empty, then there is some string w in L(M) such that I w | < t 
where t is the number of states in K. Thus, one can determine 
whether L(M) is empty by checking the finitely many strings in Z* 
with length less than t for membership in L(M). Similarly, one can 
determine whether L(M) is finite by checking the finitely many 
strings in Z* with length between t and 2t - 1: L(M) is infinite if ^ 
and only if there is some w in L(Iyl) such that t ^|w| ^2t- 1. 

From these comments and the facts about Boolean operations, 
one can show the following result. 

Theorem 2.2. For eai^i of the following problems, there is an 
algorithm that provides the solution: 

(i) Given a finite-state acceptor M and a string w, does M 
accept w? 

(ii) Given a finite-state acceptor M, is L(M) empty ? 
(Hi) Given a finite^state acceptor M, is L(M) finite? 

(iv) Given two fihite-state acceptors M x and M 2 , arel^Mj) and 

L(M 2 ) equal? * 
Let lis consider a characterization of the regular sets in terms of 
. closure operations. ^ 

Theorem 2.3. Let Z be an* alphabet. The class of regular sets 
over Z is the smallest class containing the finite subsets of Z* and 
closed under union, concatenation, and Kleenfc*. 
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The characteqzation of the regular subsets oH^lgiven.ih Theor- 
em 2.3 is important since it shows thiat e^ctf regular set can b'e 
described by a certain type of formal polynomial, a "regular ex- 
pression," as well.as by a finite-state acceptor. " ^ * 

If Z is an alphabet, then the class of regular expressions over I, is 

defined inductively as follows: ' 

?-» , 

(i) (A) and (0) are regular expressions, where A, 0, (,) are sym- 
bols not in Z; .; ' 

(ii) if a is in Z, then (a) is a regulaimpression;^ " * 

* (iiij if P and Q are regular expressions over I,- then so are 
(P + Q),(P • Q), and (P*). . , 

The correspondence between regular expressions aftd regular sets 
is clear: 

(i) (X) denotes {e} and (0) denotes 0; 

(ii) for a e Z, (a) denotes {a} ; 

(iii) if P(Q) is a regular expression denoting -the set P(Q), then 
(P + Q) denotes P u Q, (P • Q) denotes PQ, and (P*) de- 

v notes P*. v ' a 7 

Usually the symbol • and the parentheses are omitted' when no 
ambiguity is introduced. , 

- The fact that a set is regular if and only ip it is denoted by a 
regular expression follows from Theorem 2.3. 

' Aether characterization of the regular sets has become impor- " 

tant in the study of the algebraic structure of abstract automata 

and of formal languages. j ' m 

Let Z be an alphabet. A congruence relation p 6n Z* is an 

equivalence relation with the property that for any x, y e Z*, if xpy, 

then for all z, w e Z*, wxzpwyz. A congruence relation is of finite 

index if it has only finitely many congruence classes. 

* 

Theorem 2.4. D Let Z be an alphabet and let Ls Z*. The follow- 
ing are equivalent: . 

(i) L is a regular set; 

(ii) L is the union of some of the congruence classes of a congru- 
ence relation on Z* that is of finite index; 
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(iii) The relation ^defined as follows is a congruence relation on" 
I* that is of finite index: for all x v y e I* x = y if and only if 
for allw^ z e £*, whenever yrxz is in L, then wyz is in L, and 
• conversely. , . * ' . . * ■ 

' 23. At the heart of the study of abstract automata are the basic 
questions of computability theory: What is an algorithm? What 
does it mean to say that a function is 'computable? i&gicians have 
put forth numerous formal models to realize the notion of "algo- 
rithm" the model of most 'interest for tha study of automata^ being 
the Turing machine.- . 
. ,A Turing machine (see Figures' 3 and 4) is a device with a finite 
set of "tape symbols " including a "blank," a finite set of states, a 
read-write head which operates on a potentially infinite tape, and a 
finije set pf possible operations: / 

(i) erase a symbol and print a new symbol (overprint); 

(ii) change state; 

(iii) move right or left one square on the tape or do riot move at 
all; 

(iv) halt. 

How do Turing machines differ from the finite^^e acceptors 
detffcribed in the previous section? First, the head cajhite (change 
symbols) as well as read. Second, the head can nl«Httas;well^s 
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Fig. 3. A Turing machine. 

r '122 



106 



Ronald V. Book 



I N P U T 



READ HEAD 



FINITE , 

STATE 

CONTROL 



READ-WRITE HEAD 



STORAGE 



Fig. 4. A Turing machim 



ine^ith c 



one work tape. 



■ . f' • 



right. Thprdi the head can move beyond the portion^ o the tape 
which initially contains the input and! can write on initially blank 
tape squares. ■ .. " . >f . ," 

For the sake of the? uninitiated reader, one version of Turing 
machine formalism is presented here. V, 

A Turing machine oyer alphabet A is a structure (K, A, T, <5, q 0 ) 
where K is a finite set of states, A 'is. a finite alphabet with blank 
J8 e A, T = A u {L, N, R} v where L, R are three symbols not 
in K u A; 5 is a function from some subset of K x A, into A x 
{L, N, R} x K; and q 0 e K is the initial state. It is assumed that K '■ 
and Fare disjoint; 

The function 5 may be viewed as a finite set of Quintuples such, 
that (q, S, T, Z, p)is in the set if and only if 5(q, S) = (T, Z, p). 9 ' ■ 

An instantaneous description (II)) relative to M is a string of the 
form w t qw 2 where w lf w 2 e A*, w 2 / e, and q e K. s 

Let A and B be ID's relative to M. We say 'A yields B and 'write 
A h- B.if for some S, T 6 A — {/?}, w, w' e A*, and q, p e K, one of 
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the following ^occurs : ' 

;(1) 4s) = (t,n >P ),, 

A = wqSwY.B = wpTw* ; v .. 
(2) <5(q, S) = (Tj R, p), either \ 
]'l - A = wqSw', w' ^ e, and 

B = wTpw.', or A = wqS and.B == wTp/?; 
• (3) £q, S) = (T, L, p), either 

S = wS'qSw', S r e A, and 

B = wpS'Tw', or A = qSw< and B = pjJTw'. . 

An ID A is a terminal (halting) ID if there is no B suc^ that 
AhB. '." ■ . ■' ; ■ 

A computation of M is a sequence of ID's A 0 , A l5 ... such thatjjjj 
either (I) the sequence is finite, its last member A n is terminal, and* 
A,_i : h A, for i = 1, n, or (2) the sequence is infinite and Ai-k 
h Aj for all i 2 1. A computation^ proper or improper according ; 

as (1) or (2) is the case. If A 0 A„ is proper, then A n is the 

resultant of the computation beginning with Aq. 

For ID's Ab, A„ A n , if A,li h A,, i = 1, . n, then we write 

Ao^a,,. . 

It is clear that for any Turing machine M = (K, A, H <5, q 0 ) all 
the relevant information about how M will behave is«contained in 
S. Thus S may be considered to be the machine itself. > 

By considering certain instantarteous descriptions of a Turing 
machine.to be initial and interpreting the tape contents of tenni^aL^ 
instantaneous descriptions, one sees that a Turing ^'machine ^ com- 
putes a (partial) function. It can be shown that the 'class of all such 
"Turing computable" functions is precisely the; class of functions 
specified by Post noqnal systems or by the lambda-calculus of 
Church or by Markov normal algorithms or by the recursive func- 
tions of Godel, Herbrand, and Kleene. - * 

We shall consider a function to be computable by algorithm if 
and only if it is computed by a Turing machine that halts on every 
input. Such^a function shall be called recursive or total recursive. A_ 
paftialTunction computed by Turing machine' shall be called par- 
tial>ecursive. ■■ \ 

Here we shall i>e concerned with the, characteristic function of a 
set.. A set is recursive if its characteristic function is total recursive. 
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A set B is recursively enumerable if there is a partial recursive 
function f such thatif(b) isi defined and equal to 1 when b is in B 
and fl(b) is undefined or is fequal to 0 when b is not in B. Thus we * 
may consider Turing machines with certain distinguished accepting 
states so that a recursive set is the set of inputs accepted by a 
Turing machine that halts on every input, and a recursively enu- 
merable set is the set of inputs accepted by a Turing machine that 
may not halt on some of-its inputs. > \ 

A question or problem or predicate is said to be decidable (solv- 
<able) if there is an algorithm which will provide the correct answer 
to every instance of the question; otherwise the question is undecid- 
able funsolvable). Generally; we discuss questions with "yes-no" or 
"0-1"^ answers when we consider its decidability. To say that a 
problem is uncjecidable is to say that there is no algorithm that will 
compute its solution on every input. Of course this means that 
there are infinitely many instances of the problem. " 

Note that a Turing machine is a finite object. Thus by defining a 
.Turing machine one finitely specifies the set of inputs accepted by 
the machine even- though this set may be infinite, and so one can 
"name" a jecjArsively enumerable set by specifying a Turing ma- v 
chiHeThaa accepts all and only members of that set; Since a Turing 
machine is a finite object, it can be encoded as a string of symbols 
in a certain form or as an integer. Such an encoding, usually called 
a "Godel numbering," provides an algorithm to map the descrip-. 
tion of a machine M = (K, A, T, 5, q 0 ) to its encoding and also 
pjpvides ah algorithm to produce the description of M as a set of 
quintuples from its encoding.; This allows one to enumerate the 
class of all Turing machines or, equivalently, the class of all recur- 
sively enumerable sets. Using such an eriufrieration, one can con- 
struct a "universal Turing machine," a Turing machine U such that 
on inputs e and x, U simulates the computation of the machine 
with name e on £he« input encoded by x. Further, using such an 
enumeration asywell as a diagpnalization, one can obtain a basic 
result of computability theory. 1. 

Theorem 2.5. There is no algorithm which when given a descrip- 
tion of a Turing machine M and an input x will answer the 
question "Does M halt when started on input x?" 
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This question is known as the "Halting Problem for Turing ma- 
chines" and Theorem 2.5 can be restated as follows: The Halting 
Problem for Turing machines is^undecidable. An equivalent formu- 
lation in terms of recursively enumerable sets is as follows : There is 
no algorithm to answer the question "Is x in L?" for'an arbitrary 
recursively enumerable set L (specified by a Turing machine) and 
an arbitrary string x. Phrased in this way we say that the "member- 
ship problem" for the recursively enumerable sets is undectdable. 

There are numerous examples of questions about Turing ma- 
chines that are undecidable. We list several that occur in various 
forms in automata theory: 

(i) Does a Turing machine halt on all of its inputs? (Equiya- 
lently, is a recursively enumerable set actually recursive?) 

(ii) The finiteness prdblem: Does a Turing maphine halt on pnly 
finitely many inputs? (Is a recursively enumerable set finite?) 

(iii) The emptiness problem: Does a Turing machine fail to halt 
on every input? (Is a recursively enumerable set empty?) .. 

(iv) The equivalence problem: Do two Turing machines accept 
precisely the same set of strings? (Are two recursively enu- 

. / merablcsets equal?) 

Throughout theoretical computer science 'the question of whe- 
ther or not a problem is decidable has provided an important 
theme for study In this papeilwe shall use the questions abtfut 
Turing machines stated above al prototypes for the questions to be 
asked about the classes of aut^^ta, grammars, and languages 
under investigation. Notice that we have already done this with the 
^ass of finite-state acceptors in Theorem 2.2 where the answers 
v turn out to be exactly the opposite of those for Turing machines. 

2.4. There are a number of secondary sources that, describe 
finite-state acceptors and regular sets in great detail. In particular, 
the text by Salomaa [81] is very useful. A fundamental reference in 
this area is the collection of papers edited by Shannon and Mc- 
Carthy .[88]. Another useful collection (that is unfortunately out of 
print) is edited by Moore [67]. A^particularly important paper for 
the study of formal languages is that of Rabin and Scott [75]. ^ 

There are many books in logic which discuss Turing machines 
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and other formalisms for studying computability as well as the 
recursive andj recursively enumerable sdts. Books thht place special 
emphasis onvthe study of computability include those by Davis 
[28], Rogers [76], Yasuhara [92], Brainerd and Landweber [17], 
Hennie [55], and Machtey and Young [63]. , 

3. CONTEXT-FR^ GRAMMARS AND LANGUAGES 

The, study of context-free languages is fundamental to theoretical 
computer science. Major advancesnn the use of artificial languages 
such as programming languages as^vell axin the study of natural 
languages came with the realization that fonjial mathematical ma- 
chinery was required in order to generate the infinite set onftrings 
of a language. Historically, the notion of a context-free grammar as 
an important generative structure^^as^developed simultaneously by 
researchers in programming languages and linguistics. ■/ « 

In this section some of the important features of context-free 
grammars and languages are des^fced. This development is carried 
further in Section 4. 

3.1. We .begin by defining context-free languages as the 
languages generated by context-free grammars. 

A Context-free grammar is a structure G = (V, 2, P, S) where V is 
a finite set of symbols called the alphabet or the vocabulary of the 
grammar, Z c V is the terminal alphabet, S e (V — I) is the initial 
or starting symbol (sometimes called the axiom of the grammar), 
and P is a finite set of ordered pairs, P c: (Y - Z) x V*. An element 
of P is a production or rewriting rule and is written Z-* y. instead of 

vz% , ; / . - ; 

T<he definition of a context-free grammar does not explain how a 
language is obtained from a grammar; it defines a cohtext-free 
grammar as a "static" object. To explain the "dynamics" involved 
in the generation of a language, the notion of "derivation" must be 
defined. * 

• Let G = (V, Z,JP, S) be a context-free grammar. Define a binaty 

relation => on V* as follows: for any a, B e. V* and Z-> y e P, aZ/? 

=g-ay/?. For each n £ 0 define. a binary relation =^> on V* as follows: 

For every fleJt*, 0=^0; for S u 0 2 e V*, if 9 X => 0 2 , then 0!=>0 2 ; 
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for 0i, 02, 0* e V*, if 0 2 and 0 2 ~ 0 3 , then Of+O* . Define a 
binary relation 4- on V* as follows: if 0 X , O^TfV* and 0! => 0 2 for 
some n^O, then B x &'0 2 . If % ^, 0 n e V* and' for each 
i = 1, n, 0|-i ^0|, then 0 o ==>*i =g^" is 3 derivation of 
length n in G._ * t - 

The relation =?> is the transitive, reflexive closure of the relation 
=>. We read "0! generates 0 2 " for 0! ^> 0 2 or "0! generates 0 2 in n 
steps" fox 0i=>0 2 - When no ambiguity can arise, we omit the 
subscript Cjr and write " => ( 4> )ibr => ( => ). 

If G = (V, Z, P, S) is a context-free grammar, then the language 
generated by G is L(G) = {w e Z*| S=3>w}. A string y e V* such 
: that S=2>y is a sentential form ofG. 

A language L is a Context-free language if there is a context-free'' 
grammar G such that L(G) = L. /; 

It is useful to represent derivations graphically by meaSs of 
"derivation trees." A tree is a directed acyclic graph with a dis- 
tinguished node, the root. The root has in-degree 0; all other nodes 
have in-degree 1 and are accessible from the root. The nodes in a 
derivation tree are labeled with the symbols used in the derivation. 
The concept is illustrated in the following examples. 

Examples 3.1. * . 

(a) Let V = {a, b, S}, Z = {a, b}, and P = {S— aSb, ab} (see 
Figure 5). The grammar G x = (V, Z, P, S) is such that L(G X ) = 
{a n b n |n>0}. *■ ■ 

(b) Let V = {(,), S}, Z = {(,)}, arid P - {S -> (S), S-+ SS, S-*e} 
(see Figure 6). Let G 2 = (V, Z, P, S). The language L(G 2 ) is the set 
of all well-formed strings of correctly balanced parentheses. This 
language is called the (semi-) Dyck language on one letter. 

(c) Let V = {a, b, S}, Z = {a, b}, and P = {S->aSa, S->bSb, 
e}. Let G 3 = (V, Z, P, S). The language L(G 3 ) = {ww R «| w e {a, 

b}*} is the set of "palindromes" in {a, b}*, the set of strings that 
read the same backwards as forwards. -\ ," 

cdid) Let V = {(,), EJ,.S}, Z = {(,), [,]}, and P={S->(S), 

si. ^ • * ■ ■ . - •'; . 
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Fig. 5. A derivation tree for the derivation S => aSb => aaSbb => aaaSbbb 
aaaabbbb in the grammar Cj, of Example. 3. 1(a). , 

S-> [S], S-> SS, S-> e}. Let G 4 = (V, E, P, S). The language L(G 4 ) 
0is the set of all well-formed strings of two types of parentheses 

which are nested and correctly balanced. This language is called* 

the (semi-) Dyck language on two letters. 
^ (e) Let V = {x, y, z, +, *, (,), S, T, A}, t = {x, y, z, (,)}, 

and P - {S-> A, S->_(T + T), S->(T*T), T-> (T + T), T-> (T*T), 

T-> A, A->x, A->y, A->z}. Let G 5 Z, P, S). The language 

L(G 5 ) is the set of all well-formed expressions over {x, y, z} with 

binary operators + and *. 

Note that.none of the languages in Examples 3.1 is a regular set. 
On the other hand, one can show that the regular sets are gener- 
ated by those context-free grammars that are "left-linear": G = (V, 
E, P, S) is left-linear if each rule is of the form Z->aY where 
Z e y - I,ae Z* and Ye (V - Z) u {e}. 
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Fig 6. A derivation tree for the derivation S=> 0(0)0 in the grammar G 2 of Exam- 
ple 3.1(b). 

Representation of derivations by means of 'derivation trees rep- 
resents the essential property of being context-free. If two nodes are 
independent (neither is the descendant of the other), then the 
subtree rooted at one node represents a derivation that does not 
depend on the derivation represented by the subtree rooted at the 
other. This property is represented in terms of derivations by con- 
sidering "left-to-rigjit" derivations: At every step the leftmost non- 
terminal symbol is transformed. This derivation corresponds to the 
construction pf a derivation tree by always taking the leftmost 
possible branch and adjoining the next subtree to the leftmost node 
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of the current frontier if that node is labeled with a nonterminal 
symbt>I. Defining this' notion formally involves two steps. 

• (i) Let G = (V,S, i P,.S) f; Define a binary relation 4 on V* as 
follows : For Wy^ft-y e V*, and Z e V - X, if pcZ0 => oty0 
and a e X*, then aZ0 k <*y0. 
(ii) Now defined? for each n SO in order to obtain the notion 
of "left-to-right derivation of length n" and define ^> to be the 
transitive, reflexiye closure of 4. (Details are omitted.) 

: The notion of a left-Jo-right derivation provides a normal form 
for derivations of terminal strings from the initial' symbo^ in a 
context-free grammar. This is^een from the following result. 

Theorem 3.2. Let G =;(V, X, P, S) be a' context-free grammar. 
For any p e V*, w e X*, and n £ 1, there is a derivation of w from 
p in G with n steps if and only if there is a left-to-right derivation 
of w from p in G with n steps. Hence, L(G) = {w e X*j there is a 
Jeft-to-right derivation of w from S in G}. 
* 

Proof. It is sufficient to show that if there is a derivation 
r p => . =>w of length n, then there is a left-to-right derivation 
p 4- : • 4w of length n. The proof is by induction on n. 

If p => w is a derivation of length 1, theri there exist a, ft y e X* 
and ZeV-S such that p = aZ0, w = ay/?, and Z-*y is the re- 
writing rule applied in p=>w. Since Z is the only nonterminal 
symbol^ in p, the derivation p=>w is already a left-to-right deri- 
vation of length 1. 

Assume the result for all p e'V*, w e X* and all derivations of 
length no greater than n (for some n 2r"l). Suppose that p 0 =>pi =>* 
... .=>p n =>w is a derivation of length n"+ 1 with w el*. Now 
p x • • => p n => w is a derivation of length n in G so that by the 
induction hypothesis there is a left-to-right derivation p 1= >0 2 => 
... i>O n =^ w of length n in G. If p 0 =^Pi. then p 0 =^Pi =^#2 ^> 
.... i> e n 4 w is the desired left-to-rigty derivation of w from p 0 of 
length n + 1. Otherwise, there exist w x e X* yi, y 2 je V*, andZ t , 
Z 2 e V - X such that p 0 = w^y^ y 2 , p x == w x Z x y t yy 2 , and 
Z 2 t* y is the rewriting'rule applied in the derivation p 0 => p v 
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Since Z { is the leftmost nonterminal symbol in p\ andp x 40 2 is 
the left-to-right derivation, there exists P e V* such that 

O2 - w i^yiVy2 and Z t — > /J is the rewriting rule.applicd in p x => 0 2 . 
Thus, Po^WiZiyiZaya^Wi/JyiZjya^Wi/JyiVyar^ ^, so that 
w 1 /iy 1 Z 2 y 2 => 0\ ki' • -4> 0 n 4-w is a derivation of length n; By the 
induction hypotnesis there is a left-to-right derivation w 1 /Jy 1 Z 2 
y 2 ^ ==>* * * 4r n =^w of/lcngth n in G so that Po ==: W iZiyiZ 2 
yak^ w ii'yiZaya^r2'i ,, *^r il 4w is a left-to-right derivation 
oflcngth n +.1 that begins with p 0 and ends with w. □ 

3.2. Representation of derivations of context-free grammars by 
means of derivation trees suggests certain transformations that 
yield normal forms and grammars. In particular, binary branching 
derivation trees suggest certain restrictions on the form of the re- 
writing rules in the grammars. 

A context-free grammar G = (V, Z, P, S) is in Chomsky Normal 
Form if each production in P is of one of the following forms: 

Z _> a lael, ZeV-I, Y r , Y 2 e V-I~{S}. 
Z-YiY 2 J - ♦ 

Theorem 3.3. From a context-free grammar G^ one can ef- 
fectively construct a context-free grammar G 2 in Chomsky Normal 
Form such that L(G 2 ) = UGJ. 

If G = (V, Z, P, S) is a "Chomsky Normal Form grammar, then a 
nonempty string w is in L(G) if and 'only if there is a derivation of 
w from S in G of length 2|w| - 1, and the empty string is in L(G) 
if and only if S->e is a rewriting rule of G. Thus we have the 
following result. 

Corollary 3.4. There is an algorithm such that given a context- 
free grammar G and a string w the algorithm determines the 
answer to the question "Is w in L(G)?" Thus, every context-free 
language is a recursive set. 
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In \\ context-free grammar G « (V, Z, P, S) it is quite common to 
have certain symbols, in> V - Z that arc "recursive"; for some 
7ii yi e V*, y$ 2 '¥* e . Z «§■ y^y 2 , If Z is such that for some non- 
empty-string y, Z £Zy, then a number of problems can arise. This 
situation is eliminated when attention is restricted to Greibach 
Normal Form grammars. 

A context-free grammar G = (V, Z, P,S) is in Greibach Normal 
Form if each production in P has one of the following forms: 

S~>e } ~. 

Z "^ a YaeZ, ZeV-Z, Y lf Y* e V - Z -{S}. 
Z^aYi [ 

Z^aYiYjJ 

Theorem 3.5. From a context-free grammar G! one can ef- 
fectively construct a context-free grammar G 2 in Greibach Normal 
Form such that L(G 2 ) = L(G t ). 

Notice that if G=(V, Z, P, S) is a Greibach Normal Form 
grammar, then a nonempty string w is in L(G) if and only if there is 
a derivation of w from S in G of length | w |. 

Given a grammar G ~ (V, Z, P, S) it may be the case that L(G) is 
empty or that certain symbols in V or rewriting rules in P'are never 
available for use, in derivations of strings in L(G) beginning with S. 
It is desirable that such symbols and rules be eliminated, and this 
can Be accomplished effectively. 

A context-free grammar G = (V, X, P, S) is reduced if either 
V = {S} and Z = P = 0 or (i) for each YeV there exist 
a, /?.e V* such that S^aY/? and (ii) for each ZeV-Z there 
exists y el* such that Z 5>y. 

* TraoRiEM 3.6. From a context-free grammar one can ef- 
fectively construct a reduced ^context-free grammar G 2 such that 
L(G 2 ) = L(Gi) and, if Gj is iirChomsky Normal Form (Greibach 
Normal Form), then so is G 2 . 

Corollary 3.7. There is an algorithm to determine whether 
L(G) = 0 for a context-free grammar G; that is, the emptiness 
problem for context-free grammars is decidable. 
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Consider an arbitrary context-free grammar G » (V, £, P, S). Let 
k be the number of symbols in V — 2. For any symbol Z 6 V — E< 
consider S(Z) = {y e V* | there is a derivation of y from Z of length 
at most k}. Now if Y e V is any symbol such that for some 

<x u a 2 6 V*, Z=>ociYa 2 , then there exist 1 (} u f} 2 e V* such that 
P1YP2 is in S(Z). This fact is quite Useful in proving Theorems 3.3, 
3.5, and 3.6 in that it provides a bound on 'the number of deri- 
vations to be considered when testing for certain conditions regard- 
ing the rewriting rules of the grammar. » 

33. There is a particularly useful result regarding the "structure" 
of context-free languages. Its proof depends on- simple properties of , 
derivation trees. 

Theorem' 3.8. Let L be a context-free language. There exist in- 
tegers p and q such that every ^string w e L satisfying | w| >p 
may be written as w = uvxyz with vy ^ e, | vxy| <> q, and 
{uy n xy n z|n £ 0} S L. 

Proof. Let G = (V, E, JP, S) be a Chomsky Normal Form gram- 
mar such that L(G) = L. Let k be the number of nonterminal 
symbols. 

Note that since G is in Chomsky Normal Form, if a derivation 
starting with any nonterminal symbol has a derivation tree with 
longest path of length t, then the length of the string generated is at 
most 2\ and if the string generated is in I*, then it has length at 
most?' 1 . . ' - ' 

Let p = 2 k ~ l and q = 2 k . Suppose that w e L(G) and | w| > p.. 
The longest path in the derivation tree of any derivation of w from 
S in G has length at least k + 1 and so has at least k + 2 nodes, 
k + 1 of which are labeled with nonterminal symbols. Thus at least 
two of these nodes are labeled with the same nonterminal symbol, 
say A 6 V — £ labels nodes n x and n 2 with n 2 closer to the leaf 
than n l and with the subpath rooted at n t having length at most 
k + 1. . - 

Consider the subtree with root n x . No path in this tree has 
length greater than k + 1 so that the terminal leaf string has length 
at me&t 2 k = q. Let x be the terminal leaf string generated by the 
subtree with root at n 2 . Let v And y be strings so that vxy is the 
terminal leaf string generated by the* subtree with root at n x . Thus, 
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| vxy | £ q. Since G is in Chomsky Normal Form and nj and n 2 arc 
in the same path with n t above n 2 , either v ^ c or y 9* c, 

Let u and z be strings so that w » uvxyz, By considering the 
portion of the derivation which does not Contain the subtree rooted 
at node n,, we have Ss£uAz. By cohsidering the portion of the 
derivation from node ni to node n 2 we have A«£vAy, and hence 
for any n > 0, A 4v n Ay n . By considering the portion of the deri- 
vation from node n 2 , We have A £ x, Thus, {uv n xy n z | n £ 0} £ L, 

a a 

Corollary 3.9. For each context-free grarrfmar, G, there is an 
integer k > 1 such that L(G) is infinite if and only "if L(G) contains 
a string w such that k £ |w| < 2k. Hence there is an algorithm to 
determine whether a context-free grammar generates only a finite 
language, that is, the finiteness problem for context-free grammars 
is dccidable. 

Theorem 3.8 is known as the "pumping lemma" for context-free 
languages and is an example of an "intercalation" theorem, A 
stronger intercalation theorem for context-free languages is known. 

For any set A of symbols, any w e A* such that w ^ e, and any 
integer i such that 1 £ i £ | w |, the symbol a occurs in the ith posi- 
tion o/w if w - yiay 2 and \y x | = i - 1. 

Theorem 3.10. For each cofttext-free grammar G = (V, I, P, S), 
there'is an integeH^ 1 such that for any w € L(G) with | w | £ k, if 
any k or more distinct positions in w are designated as dis- 
tinguished, then there exist Z e V - E and u, v, x, y, z 6 S* such 
that each of the following conditions is fulfilled : ^ 

(i) S uZz, Z £ vZy, and w = uvxyz; ^ 

(ii) x contains at least one of the distinguished positions of w ; 

(iii) Either both u and v contain distinguished positions, of both * 
y and z contain distinguished positions; 

(iv) vxy contains at most k distinguished positions. 

These results can be used to show that certain languages are not 
context-free. For example, none of {a n b n c n J n > 0}, {a n bVd m |n, 
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m > 0}, or {wcw*cw|w e (a, b}*} is context-free but each cun be" 
expressed as the intersection of two context-free languages. 

- Corollary 3.11. The class of context-free languages is not 
closed under intersection. 

4 

It should be clear that for an arbitrary context-free grammar O 
and an arbitrary string w in L(G), there may be- more than one 
derivation of w in G or even more than one left-to-right derivation. 
Thus we are interested in" the "ambiguity' 1 of w in G. 

Let G = (V, I, P, S) be a context-free grammar. A string 
w e L(G) is ambiguous in G if there exist two distinct left-to-right 
derivations of w from S in G. The grammar G is an ambiguous 
context-free grammar if there exists a string in L(G) that is ambigu- 
ous in G; otherwise, G is unambiguous. A context-free language L is 
an inherently ambiguous context-free language if for every context - 
fre* grammar G with L(G) = L, G is ambiguous; otherwise, L is 
unambiguous. 4 

The following result can be established by using Theorem 3.10. 

Theorem 3.12. There exist inherently ambiguous context-free 
languages, e.g., {a 1 b J c k | i = j or j = k). . 

3A At this point the reader will note that the only method of 
showing that a language is context-free is to exhibit a context-free 
grammar and to show that the grammar generates the language. 
To provide another method of showing that a language is context- 
free as well as to enrich our understanding of this class of 
languages, we consider closure properties of this class. . 

Theorem 3.13. The class of all context-free languages is closed 
under each of the following operations: union, concatenation, 
Kleene * , intersection with regular sets, inverse homomorphism, 
reversal, substitution, and arbitrary homomorphic mappings. 

' The operations given in Theorem 3.13 are not independent; for 
example, if L is any class of languages that contains the regular sfcts 
' and is closed under substitution, then L is closed under unidn, 
concatenation, Kleene * , and arbitrary homomorphic mappings. 
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Further, these operations do not characterize "context-freap-ness." 

However, some of these operations can be used to provide such a 

characterization. 

For ayy n ^ l, Iet A n be a set of 2n distinct letters, A n = {aj, 
: ^hi *U , Let ~ be the congruence on A* determined by 

defining^ a^ ^ e for eaqji i = 1, . . . , h. The Dyck set D n on n letters 

Is the set {w e &* | w ~ e}. 

Generalizing from Dj and D 2 in Examples 3: l(b)^mdl 3.1(d), it i§» 

clear th^t for evgry n, D n is a context-free language. For any n ^ 1, 
* any two Dyck sets on n letters are isomorphic as subsemigroups of 

free semigroups so that one refers to the Dyck set on n letters. 

Intuitively, D n is" the set of balanced nested strings of matching 

parentheses of n types. ^ / _ 

° From the Dyck sets we obtain" a characterization of the 
.context-free languages. This result is a version of the 
"Chomsky-Schutzenberger Theorem." It represents an important 
theme in the ma thematical theory of formal languages. ; 

Theorem 3.14. For each context-free language L there is a 
regular, set R, a riqnerasihg homombrphism h^ and ^ homomor- 
phism h 2 such that L = h^h^ r (D 2 ) n R), where D 2 is the Dyck set 
•on two letters. ■ ^ 

Proof. We -shall sketch the construction of a tegular set R and a 
homomorphism h x such that hi(D t O -R) ~ I* Vyhere t is a con- 
stant that depends on a grammar generating L; if A t <== {a ls . . . , a t , 
a ls aj and A 2 = {a 1 , a 2 , a l9 a 2 ), then the homomorphism h 2 : 
' A t *-> AJ determined by defining h 2 (ai) =Vi&2 anjd h 2 (ai) = a 2 ai 
for every i has the property that h 2 ^(Di) = D t . Hence, 

Let G = (V, Z, P, S) be a Greibach Normal Form grammaf such 
'that L(G) = L - {e}. For each symbol Z e V,: let 2 te^a^riew 
' symbol and let A t = V u {2| Z,e V}, so that Vis the number of 
symbols in V. Let p and q be two. new symbols, pi q £ A t . Let 
G 0 = ({p, q} u A' t , A t , Pq, p\ be the Jeff linear grammar obtained 
by defining P 0 as follows: 

' (iV p'U Sq is in P 0 ^ , \ . /'• 0 

(iij for^ach^Z e V — 5 Z, a e Z such thatZA^is in P, q— > ; aaZq 
/isinR 
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(iii) for each Z, Y e V - Z, a e Z sucji that Z 
. q~> aaZYq is in P 0 ; 

(iv) for each Z, Y^' Y 2 e V - Z, a e Z sucfi ttxatr: 
P, q 2 ^ aaZY 2 Y x q is in P 0 ; 

(v) q-* e is in P 0 . 



. Let R = L(G 0 )^so that G 0 being a left-linear grammar implies 
that R is a regular set. Let h x : Af-^.Z* be the homomor- 
phism determined by defining h x (a) == a for a e Z and h x (a) = 
h^Z) = h^Z) = e for a e Z, Z e V — . Z. By corisidering left-to^ht 
derivations in G, one can show that h!(D t n R) =-L — {e} .and"if 
e e L, then h 1 (D,' n R') = L where R' = R 0 {e}. By using a tech- 
nical variation on this construction, h r can be made nonerasing. □ m 



Since each Dyck set is a context-free] language* and the class of 
context-free languages is closed under inverse homomorphism, 
homomorphic mappings, and intersection with regular sets, the fol- 
lowing result is immediate. ' >^ 

r - it ■ 

Corollary 3.15. The class of context-free language^ is the smal- 
lest class containing D 2 and closed under homomorphism, inverse^ 
homomorphism, and intersection with regular sets. j 



It should be noted that in Theorem 3.14, if D 2 is replaced by D x , 
then we cannot obtain all the context-free languages, i . 



3.5. There is a Result of mathematical interest regarding the 
number 6f occurrences of individual letters in the strings making 
up a context-free language. A . : 

For any kXQ, let N (k) be the set of all k-tuples df natural num^ 
bers, so that N (k M^elo§ed under addition by coordinates and under 
scalar multiplication. P / 

A subset Q of N (k) is linear if there exist a, ji X9 . . . , /? m e N (k) such 
that Q = fa + nxPi + ; ■ ■ + n^/y n, e N j, and a subset of N (k) is 
semi-linear if it is a finite union of linear sets/ v 

Let Z be a finite alphabet, say Z = (a, jl ^-ik'kf. The Parikh 
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mapping ^ k of I* : onto*N (k) is defined: as follows; 

^k(e) = (0,..,CD;: I j , % 

^ k (aj) = (Z n , . . Z ik ) where Z,j = 0 for j ^ i and Z H = 1 ; 

^k(bi b„) = X ^k(bj)> for n^;i, each bjeZ. 

Trr^'""'"''''':-." - ■ , :,' ... . • 

Theorem 3.16. If L is a context-free language and LcZ* where 
I is a finite alphabet, then the image of L under the Parikh map- 
ping is a seifii-Iinear set. 

Of .course there exist languages that are not context-free but 
do have a semi-linear image under the Parikh mapping, e.g., 
{a n b n c n |n£ 1}. 

3.6. It has been nbtedf that certain questions about context-free 
grammars are decidable, e.g;, . the question "For a context-free 
grammar G, is L(G) empty?" However, some other important 
questions are undecidable. To show that a question about context- 
free grammars is undecidable, one of two basic techniques is us- 
ually employed: reduction to the halting problem (or some other 
undecidable question) for Turing machines or reduction to the Cor- 
respondence Problem. , ! 

^Let us consider what is involved iri reducing a questipn . about 
context-free grammars to the halting problem for Turing machines. 
We begin by considering a Turing machiae M which has one tape 
and one read-write head that operates on that tape. Without loss of 
generality, assume that the computation of machine M on an input 
string w halts if and only if M accepts w. A finite computation of M 
on input w can be represented by a sequence of "instantaneous 
descriptions," strings that describe MY tape contents and finite- 
state control. Such a sequence ID 0 , ID X , .... ID n 'of instantaneous 
descriptions represent an accepting computation of M on w if ID 0 ' 
represents M's initial configuration on input w, ID n represents an 
accepting (halting) configuration of M^ and for each j = 1, n, 
IDj represents M's tape contents and finite-state control after the 
transition function acts on the instantaneous description IDj-i- 
Let A be the alphabet containing all the! symbols used by M as well 
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as symbols representing the states in M's finite state control. Let # 
be a symbol not in A. For each input string w, one cart construct a 
context-free grammar G (depending only on M and w) such that 
L(G) is the set of all strings y in (A u {#})* such that y is not 
# ID 0 # ID !#...# ID n # where ID 0 , ID lf . . . , ID n represents an 
accepting computation of M on input w. Thus .each y in L(G) either 
is not of the form of such an encoding of an accepting computation 
or is of the form of such an encoding but contains a "mistake"; that 
is, if y = # x 0 # X! # ... # x n # where each x, is a string in A* that 
encodes a, configuration of M, then either x 0 is not of the form of 
an initial configuration of M on w or x n is not of the fo^m of an 
accepting configuration of M or for some j, 1 ^ j ^ n, the string Xj 
is not of the form of the configuration obtained by applying M's 
transition function to the configuration represented by x^. Due 
to our assumptions concerning M the computation of M on w 
halts if and only if M accepts w. Thus, M accepts w if and only if 
- L(G) =,(A U {#})*. Now (A u {# })* is a regular set and hence 
L(G) is a context-free language. Thus the question "For context- 
free grammars G t and G 2 , is I^G^ equal to L(G 2 )?" is undecid- 
able, for otherwise the halting problem for Turing^machines would 
be d^c^Ie. 

' ■ • . .. ' ■ • ." 

Theorem 3.17. The equivalence problem for context-free gram- 
mars is undecidable. 

■ ^ • >■ ' ■ ;- ' ' 

Variations on the technique described above can be used to 
obtain the following results. 

Theorem 3.18. Each of the following questions is undecidable: 

(a) For a context-free grammar Gj is L(G) regular? - 

(b) For a context-free grammar G and a regular set R, is 
L(G) = R? ' . • V 

(c) For a context-free grammar G and a finite alphabet Z, is 
Z* - L(G) empty? 

. 1 (d) For axontext-free grammar G, is L(G) co-finite? 
(e) For a context-free grammar G, is G ambiguous? I 
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(0 For a context-free grammar G, is L(G) inherently ambigu- 
ous? ; 

(g) For context-free grammars Gj and G 2 , is L(Gj) n L(G 2 ) 
empt^? 

(h) For' context-free grammars G r and Gi, is HG X ) n L(G 2 )* 
finite? 

(i) For context-free grammars G x and G 2 \ is MGj) £ L(G 2 )? 

3.7. The concept of a context-free grammar and the language it 
generates originates with Chomsky [19J|-[25], who attempted to 
develop a reasonable mathematical model for the description of 
natural language. The theory developed initially through the work I 
of Chomsky and of Bar-Hillel [7]-[9]. Around 1J560 it was dis- 
covered that the. formal description languages used by Backus to 
specify certain aspects of programming languages were precisely 
the context-free languages. What has become known as Backus- 
Naur Form (BNF) was used in the syntactic definition of the 
language ALGOL 60 [68]. 

The textbook by Ginsburg [37] is a fairly complete treatment of 
the theory of context-free languages circa 1966. The books by Salo- 
maa [82] and by Hopcroft and Ullman [58] discuss the class of 
context-free languages as well as other classes of languages. Th6 
textbooks by Aho and Ullman [3], [94] and by Lewis, Ro- 
senkrantz, and Stearns [61] describe those aspects of the theory of 
context-free languages that are of great use in compiler design. 

There are several papers wfiiclTare particularly useful in tracing 
the development of the theory^l&cdQtext-free " languages. These 
papers contain some of the resuISPnqted in this section. Theorem 
3.5 is due to Greibach [45], Theorem 3.8 to Bar-Hillel, Perles, and 
Shamir [9], Theorem 3.10 ^o Ogden [71], Theorem 3.14 to 
Chomsky and Schutzenberger [25], and Theorem 3.16 to Parikh 
[72]. A different viewpbint of the specification of. context-free 
languages is due to Nivat [69]. 

4. PUSHDOWN STORE ACCEPTORS ■ 

Two characterizations of the context-free languages were es- 
tablished in the last section: First, a language was defined to be 
context-free it was generated by a context-free grammar; second, 
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it was shown that a language is context-free if and only if it can be 
represented as ^ hj^hzH^J ^R) where h x and. h 2 are homomor- 
phisms, D 2 is the Dyck set on two letters, and R is a regular set. In 
this section we provide a characterization in terms of a class of 
abstract automata, the "pushdown store acceptors." 

A "pushdown store acceptor" is an automaton with an input; 
tape, a finite set of states, and a last-in first-out data structure: a 
pushdown store. A pushdown store may be regarded as a one-way 
infinite tape whose contents can be changed only at one end, the 
"top." The information obtained from the pushdown store in a 
single step is the top symbol on the tape. Reading a symbol from 
the pushdown store automatically erases ("pbjps?')' it from the tape. 
Symbols can be added only at the top and only a bounded ifumber 
can be added ("pushed down"). in\ny single step. The acceptors 
have an input tape with a head that moves across the tape from left 
to right, reading the contents of successive tape squares but neycv 
writing. Depending on the current state, the input symbol being 
scanned, and the symbol on the top of the pushdown store, the 
transition function determines whether or not a new input symbol 
is. to be read, what the next state is to be, and how the pushdown 
store is to be altered. 

Pushdown store acceptors may be viewed as extensions of finite- 
state acceptors: add. a pushdown store as auxiliary' storage to a 
finite-state acceptor. However, there are several other differences. 

A finite-state acceptor reads input from left to right and reads a 
new input symbol at every step. Any abstract automaton (or 
Turing machine) with this property is said to operate in real time. 
Notice that if an ^itomaton operates in real time, then a compu- 
tation on an input string of length n has at most n steps. A push- 
down store acceptor need not read a new input at each step; it may 
perform a sequence of transitions that only change state and ma- 
nipulate the pushdown store. It can be shown that for each push- 
down store acceptor there is a constant k such that a computation 
on an input string of length n has at most kri steps. Any abstract 
automaton (or Turing machine) with this property is said to oper- 
ate in linear time. 

. There is an extremely important difference between the way that 
a finite-state acceptor, operates and the way that a pushdown store 
acceptor operates. Invthe definition of finite-state acceptor (and of 
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Turing machines) given in Section 2, aKeach step of the compu- 
tation 1 there is exactly one transition that\the machine can make. 
This x means that the machine" is "deterministic.""Howeverr-when- 
studying abstract automata, one considers the "nondeterministic" 
lode of operation in which there is a finite number of possible 
trahshions and the automaton must "guess" the correct choice of 
transitibqs. In this way, nondetenninistic automata do not faith- 
fully modSkthe behavior of actual computing machines but do 
form a mathematical construct that plays an important role in the 
study of automata and formal languages and of computational 
complexity. In particular, nondeterministic p^ishdgwn store accept- 
ors characterize the clasiKof context-free languages while the deter- 
ministic pushdown store acfcoptors do not. I 

itions by defining deterministic 




v 4.1. W$ begin^ the formal" de 
pushdown storie acceptors. 



A deterministic 
(see Figure 7) hia^ 

pushdown alphabet T, a (partial) transition 

8:K x(I u {e})x(ru {e}) 



pushdown store acceptor*® == (K, Z, F, <5, q 0 , F) 
s a finite set K of states; art^nput alphabet X, a 

function 



K x f* 



an initial state q 0 \e K, and a set F of accepting states. Th^transi- 
tion function is /restricted so that for each qe^C and Z 6 fNj {fe}, 
either (a) 5(q, e, Z)Ms undefined and for each! a 6 1, <5(q, a, Z^is 
defined, or (b) &(q, e, Z) is defined and for each a e E, <5(q, a, Z) ft 
undefined. r V\ tt -j 

In the definition of a deterministic pushdowh store acceptor, the 
transition function is definedxin such a wajj that, based on the 
current state and the current "top" of the pushdown store, either a 
new input/symbol is read (part (k)\ or a transition involving only 
change of/state and manipulation of the pushdown store is specified 
(part (b% This is ^n essential feature otthc definition of a determin- 
istic machine. \>' 

Again note that the definition specifies a pushdown store ac- 
ceptor as a static object. To explain the dynamics of how such an 
acceptor computes is to explain how t£e instructions encoded in 
the transition function are applied to the input tape and the push- 
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rio, 7. A pushdown store acceptor. 

down store. With a finite-state acceptor it was sufficient to extenjj, 
the transition function to input strings instead of individual input 
symbols^ut here it is necessary to define "instantaneous descrip- 
tions" and a "yield" relation between instantaneous descriptions. 
xlf;Df= (K, T, <5, q 0 , F) is a pushdown store acceptor, then an 
^instantaneous description of D is an element of K x E* x r*. An 
initial instantaneous description is^any element of {q 0 } x 2* x {e}. 
Define ji binary relation h (read: "yields" or "yields in one step") 
on the set of instantaneous .descriptions of D as follows: for p, 

'. ^ : i 4 ^Jt 
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qeK, a'elu {e}, u, v e T* Ze^S weZVif 5(q, a, 
Z) = (p, u)j then (q, aw, Zy) h (p, w, uv). I^t I* denote jtji^transi- 
tive reflexive closure of K 

An instantaneous description (q, w, y) is interpreted as a push- 
down store acceptor configuration with current state q, input string 
w, and pushdown store contents y; w represents the input remain- 
ing to be processed, and if w ^ e, then the leftmost symbol is the 
currently scanned input symbol; y represents^the current pushdown 
store contents, and if y^e, then the leftmost symbol of y 
is the symbol on the "top" of the pushdown store. If (q, aw, Zy) h 
(p, w, uy), then ffom state q, whilje reading a e'Z O {e} (i.e., either 
reading a e I as input or* ignoring the input if a — e) with 
Z e T u {e} on the topf of the pushdown store (i.e., the pushdown 
store is riot empty and the symbol Z e r is the contents of the top 
tape square or the pushdown store-is empty alid Z = e), the accept- 
or goes to«tate p and replaces Z on the top of the pushdown store 
with u. If u = e, then this transition "pops" Z from the .pushdown 
store; if u e, then this transition erases Z and writes u in place of 
Z, "pushing down" the rightmost |u| — 1 symbols of u into the 
pushdown store (one symbol per tape square) so that the leftmost 
symbol (top) of the store contains the leftmost symbol of u. 

There are three notions of "acceptance" by a pushdown store 
acceptor D=r(K, I, JT, <5, q 0 , F). Define L(D) = {w e S* | there 
exists q e F such that (q 0 , w, e) I* (q$ e, e)}, T(D) ={weZ*| there 
exist q e F and ueP such that (q 0 , w, e) £ (q, e, u)}, and 
N(D) =_{w e £ * I ^ere exist q e K such that (q 0 , w, e) £ ( (q, e, e)}. 

For a pushdown store acceptor D, L(D ) represents "acceptance 
by final state, and ertipty store," T(D) represents "acceptance by 
final state," a a nd N(D) represents "acceptance by empty store." N 

It Js easy to show that for any deterministic pushdown store 
acceptor D l5 *one can construct a deterministic pushdown store 
acceptor p 2 with the property that L(D X ) = N(D 2 ) = T(D 2 ),' and 
that for such a D l one can construct a D 3 with the property that 
N(D X ) = L(D 3 ) =^T(D 3 ). However, the language L = {a}* u {a n b m 
|n > m > 0} is such that there is a deterministic pushdown store 
-acceptor D\ such that TfD^ — L but there is no acceptor D 2 such 
'that N(D 2 ) = L or L(D 2 ) ;= L. Thus, in order to describe the largest 
class of languages as "deterministic context-free," we choose "ac- 
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ceptance by- final state" as the method of acceptance to specify 
these languages. 

A language L is deterministic context-free if there js a determini- 
stic pushdown store acceptor D such that T(D) = L. 

At this point we have not justified the use of the term "MeterminN 
stic context-free" since we have not shown that T(D) is a contexts 
free language when D is a deterministic pushdown store acceptor. 
However, this wilf be an immediate corollary of the 
characterization of the context-free languages as the languages ac- 
cepted by nondeterministic pushdown store acceptors. 

Examples 4.1. 

(a) Let E ==? {a, b, c} and L = {wcw R |w e {a, b}*}. For K = 
{q 0 , q lf T = {a, b}, F = {q 2 }, and the transition function <5 
given below, D = (K, S, T, <5, q 0 , F) is a deterministic pushdown 
store acceptor such that T(D) = L. 

< <5(Qo>a, e) =(q 0 ,a) 
> Y' <5(q<i, b, e) =(q 0 ,b) 
.' <5(q 0 »c, e) = (q t , e) 
<5(q 0 ,a, a) =?(qb',.aa) 
<5(q 0 » a, b) = (q 0 , ab) 
<5(q 0 ,b, a) = (q 0 /ba) 
V; ^ <5(q 0 ,b, b) - (q 0 , bb) 

''hrA^MM Cqi» b) 

<5(qi, ; b, b) = (q lf e)> 
<5(qi, e, e)* = (q 2 , e) 

From the initial state q 0 , D reads a string w e {a, b}* and .copies 
it onto the pushdqwn store. When D first reads c, D transfers into 
state q t and then attempts to match the remaining input with the 
contents ofihe pushdown store. The input is accepted if and pnly if 
the computation ends in state q 2 , and D moves into statefl 2 if and 
only if D is in state q t with the pushdown store empty (sp thiat the 
input read-before the c is the reversal of the input read after the c). 

(b) Let 2 = {(,), [,]} and let L^E* be the Dyck set on two 
letters (see Example 3.1(d)). For K = {q 0 , qj, F = {(, [},F == {q 0 }, 
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and the transition function given below, D> (K, 2* T, <5, q 0 , «F)/fs a 

deterministic pushdown store acceptor such that T(D) = L. ' *" 

."'">» 

f <5(q 0 ,(, e) =(q 1? () , 
^(qo»C»e)=(q 1 , [) - 

t() =(qi>(() 
■tfqi.DO =(qi.«) . . ;■ 

=(qi>(0 ' 



\ , t ^qi>CC)=(qi.CC) 
\ <5(qi>)> () =(qi,e) • ( 

\ *qi,M)=(qi,e) 
\ ' 4(qi, e, e) = (q 0 ,e) 

Deterministic pushdown store acceptors "can be made to halt," 
that is, for every such D^ne can construct D 2 such that T(D 2 ) = 
V^l^D.iX'&nd on every input string D 2 's computation halts. Thus one 
^ cari «show that the complement of a deterministic context-free 
language is also a deterministic context-free language. This is not a 
property of the class of all ^context-free languages,. for otherwise the 
class of' context-free languages would be closed under intersection. 
Hence there are context-free languages ttiat are not deterministic 
context-free; for. example, Lj = {ww R | w ^ {a, b}*} and. L 2 = v 
{a p b q c r | p £ q or q ^ r}. Finally ffcte that deterministic pushdown 
store acceptors "can be made to operate in linear time" but that the 
language '{w^Wj # ^ #^^ n ^'#wf |n > 1, 0°^ i £ n - 1, 
each Wj e {a, ty}*} isva d^^^^^gcontext-free language that 
cannot be accepted in rearfin^^^^^^rministic pushdown store 
acceptor (or multitape Turing machine). 

4.2.- Now we turn to the study of nondeterministic pushdown 
< store acceptors. ' 

A nondeterministic pushdown store acceptor 

, D = (K,2, F,<5, q 0 ,F) ; 

has a finite set K of states, an input alphabet I, and pushdown 
- alphabet T, a transition function 8: K x (E u {e}) x (r u>{e})-> 
(finite subsets of K x T*), an initial. state q 0 , and a set F of accept- 
ing states; 

The notion of instantaneous description is defined just as in the 
deterministic case. For a nondeterministic ^pushdown store acceptor 
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D = (K, I, T, <5, q 0 , F), defines binary relation h on the set of 
instantaneous descriptions of D as follows: for p, q e K, 
aelu {e}, ZeTu {e}, u, veP.we I*, if'(p, u) e <5(q, a, Z) 
then (q, aw, Zv) h (p, w, uv). Let h* denote the transitive reflexive^ 
closure of K v 

The definition of a nondeterministic pushdown store acceptor 
differs from that of a deterministic acceptor in that the transition 
function specifies a finite set of possible next moves. To represent 
all possible computations of a nondeterministic pushdown store 
acceptor on a given input string, a "computation tree" of instanta- 
neous descriptions is use*. A single computation is represented by 
a path in this tree starting at the root and ending at a leaf. 

The three definitions of acceptance given for deterministic push- 
down store acceptors are vaUd for nondeterministic pushdown 
store acceptors. However, in the pase of the nondeterministic 
model, the three methods of acceptance are of equal power: 
{L(D)|D is a ^nondeterministic pushdown store acceptor} = 
{T(D)|D is a nondeterministic pushdown store acceptor} = 
{N(D) | D is a nondeterministic pushdown store acceptor}. 

It must be emphasized that a nondeterministic pushdown store 
acceptor D accepts a given input string w if and only if there exists 
an accepting computation in the computation tree of D and w. 
Some computations of D on w may not end, in an accepting con- 
figuration even though D accepts w. For D to reject w, all compu- 
tations of D on w must end in nonaccepting computations. Gener- 
ally, a nondeterministic pushdown ^tore acceptor" D cannot tell 
whether it cejects an input string w; to determine this,' one must 
"deterministically simulate" all of D's computations, and construct 
the computation tree for D and w. \ 

Examples 4.2. , , ^ 

(a) Let 2 = {a, b} and L = {ww R |w e {aTb}*}. For *L = {q 0 , qi, 
q 2 }. T= {a v b}, F = {q 2 }, and the transition function <5 given 
below, D = (K, I, T, <5, q 0 , F) is a nondeterministic pushdown 
store acceptor such that T(D) = L(D) = N(D) = L. 

; <5(q 0 , a, e) = {(q 0 , a)} . . . 

<5(q 0 ,b, e) = {(qo,b)} 
<5(q 0 ,a, a) = {(q 0 ,aa), (q lf e)} 
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<5(q 0) a, b) = {(q 0 , ab)} 
<5(q 0) b, a) = {(q 0 , ba)} " 
<5(qo,b,b) = {(q 0> bb) > (q 1> e)} 
. <5( qi> b,b) ={(ql,e)} ' ' . . " ' 
' , ^(qi.*a, a) = {(qLe)} — 
<5(q„;e,e) ={(q 2 Xe)} -. 

D reads input symbols from Z and writes these symbols on the 
pushdown store_ while in state q 0 . The onry^use of nondeterminism 
is the "guess" that half the input has been reaihand so it is time to 
transfer control to state q, and match the remaining input with the 
contents of the pushdown store. 

(b) Let Z = {a, b, c}. and L = {aPbV | p # q or q # r}. For K = 

{qi I > = 0 9}, F = {1}, F = {q 5 }, and the transition function 8 

given below, D = (K, Zyr, <5, q 0 , F) is a nondeterministitf push- 
down store acceptor such that T(D) = L(D) =,L. 



<5(qq> 


e, e) 


= {(Si. e )> (qe. e)} 


<5(qi. 


a, e) 


= {(qi. 1)} 


<5(qj. 


a, 1) 


= {(Qi, 11)} 


<5(qi. 


b, 1) 


= {(q 2 >e)} 


«5(q 2 >b, 1) 


- {(q 2 > e)} 


: <5(q 2 . 


b, e) 


= {(q 3 ,e)} 


<5(q 2 . 


Cl) 


= {(q4.e)} 


<5(q 3 > 


b,e) 


= {(q 3 ,,e)} 


«5(q 3 , 


c', e) 


= {(q 4 < e)} . 




c, e) 


= {(q4.e)} 


<5(q*> 


c, 1> 


= {(q*. e)} ■> . 


<5(q*. 


e, e) 


= {(q s . e)} 


<5(q*, 


e, 1) 


= {(q4.e)} 


<5(q 6 . 


a, e) 


= {(q 6 ,e)} 


<5(q?, 


b, e) 


= {(q 7 >'D} 


<5(q 7 > 


b, 1) 


= {(q 7 .-ll)} 


' <5(q 7 » 


c, 1) 


= {(q 8 .e)} 


<5(q 8 . 


c'D 


= {(q 8 .e)} 


<5(q 8 > 


c, e) 


-•{(q*,e)} 


<5(q 8 > 


e, 1) 


= {(q 9 ,e)} 


<5(q 9 > 


e,l) 


= {(q9.e)} 


\ <5(q 9 > 


e, e) 


= {(q s :e)} > 
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In thfs case D initially guesses that p> q (by going into state 
or that q £ r (by going into state q 6 ). Once this guess is made, D , 
simply checks to see whether the guess is correct. ■ 

• Now we sketch the proof of the characterization of the context- 
free languages as the languages accepted by the nondeterniinistic 
pushdown store acceptors. 

Recall that a language L is context-free if and only if there is a 
Greibach Normal Form grammar* G = (V, S, P, S) such that 
L(G) = L. From G construct a nondeterministic pushdown store 
acceptor D = (K, E, T, d, q 0 , F 4 ) as follows: K =? {q 0 , q lf q 2 }, 
T ^=*{$}'u (V - S) where $ is a symbol not in V — E, F = {q 2 }\ 
and 5 is given by (iH»i) : - x 

W mo . c, ej - § ^ otherwise. • 

(ii) For each a e £ and Z e V - E, <5(q!, a, Z) = {(q u y)|y e 
(V - SXV - S) u (V - S) u {e} and Z-* ay is in P}. 

(iii) ^(q 1 , e,» = {(c l? ,e)}. 

Recall that w is in L(G) if t and only' if there is a left-to-right 
derivation of w from S in G. The nondeterministic pushdoyvn store 
acceptor D is constructed so that its accepting computations simu- 
late the left-to : right derivations of G. At any point in a compu- 
tation, the symbol on the top of the pushdown store corresponds to 
the leftmost nonterminal symbol in the string in the corresponding 

derivation. . b " ■ . ■ , 
"1 This constructio A yields the following result. 

Theorem 4.3.»For every context-free grammar G, one can con- 
struct a nondeterministic pushdown store acceptor, J5* such that 
L(D) = T(D) = N(D) = L(G). • 

In* the construction of the nondeterministic pushdown stpre ac- 
ceptor D from the Greibach Normal Form /grammar G, the use of 
the symbol $ as an "endmarker" for the pushdown store allows one 
to claim that L(D) = T(D) = N(D).,Thus in the next theorem we 
shall be concerned only with L(D). . 

We wish to show that if D = (K v S, T, 8, q^, F) is a nondetermin- 
istic pushdown store acceptor, then L(D) is a context-free language. 
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To accomplish this we- show, how to construct a context-free gram- 
mar G so that the left-tb-right derivations of strings in L(G) corre- 
spond to the accepting computations of D. 

Let A = (K x F x K) u (ICx {e} x K). We vh?w the elements of 
A as individual symbols and we assume' that Z n A = 0. Let S be 
a new symbol, S ^'Z u A, and let V = E u A u {S}. : Define the set 
P of productions of G as follows: „'.v: 

■ - t. ■ ■ ■ . . ' . s • \. 

"■■ (U {S-> (q 0 , e, p)| p e F} S P; . 

(ii) For every, choice q, q lf q t+4 e K, ae Z u^e},' . 
Z e'r u je}, Yi, ...,Y t er, t £ 1 such that 5(q, a, Z)Wn- . 
.tains (^ Y'i ... Y t ) where p = q l+1 , (q, Z, p)->a(q lf . Yi; 
^2)(q2 > Yi, q 3 )\- -(q t ,Y t ,q t+1 )is inP; 

(iii) For every q, p e K, a el u {e}^ Ze ro {e} juch that. 
, 5(q, a, Z) contains (p, e), (q, Z,p)->a is in P; 

(iv) For every p e F, (p, e, p)-r> e is in P. 

The symbols (p, Z, q) "encode" three pieces of information. The * 
first coordinate represents the current state of a computation of D.; 
The second cooilfinate represents the symbol on the top of D's 
pushdown store -if the sto^e is not empty (Z =£ e) or is the empty 

, word e if the J.st6re is empty. The third coordinate represents a 

*"giiess" of the state P will have reached when, in a subcomputation 
starting in state p With Z on the top of'the store, this square is 

' emptied for the first tijne. • • • • - 

. The productions in P allow for the simulation by a left-to-right 
derivation in G of a computation of D. By induction on the length 
n of the computation, it can be shown that for any w e E* q e K, 
and Y lf . .., Y t e F, (q 0 ; w, e) h* (q, e, Y x ... YJ is a computation 
of length n in D if and only if fof; every choice. qi, qt+i e K 
with q x = q, (q 6 ye, q t + i)^ w(q x , Y x , q 2 )... . (q^ Y t , q^x) is a left- 
to-right derivation of length n in G. Similarly, for any w e l* and 
p e F, (q 0 , w, e) h* (p, e, e) is a computation in D if and only if S 
=> (q 0 , e, p) ^> w(p, e, p) => w is a left-to-right derivation in G. 

* a This construction leads to the following result. ; 

Theorem 4.4. If P. is* a nondeteftriinistic pushdown- store ac- 
ceptor, then from P one can: construct a context-free grammar G 
such that L(G) = L(D). 
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From Theorems 4.3 and 4.4 we haye our characterization of the 
context-fr^e languages as the languages accepted by nondetermin- 
• istic pushdown store acceptors. Further, the constructive inter- 
change between grammars and acceptors preserves the decidability v 
and undecidability. of questions about context-free languages 
whether these languages are specified by grammars or by acceptors. '. 

4,3, The importance, of. the pushdown, store as a data structure 
** useful in natural language^fdcessing was recognized very early (see 
Kunp [60]). Papers by Chomsky [22], Schutzejtiberger [85], and 
Evey [32] dlpbnstrated the usefulness of the formal model in 
. studying context-free languages. Hartmanis [52] used pushdown 
store acceptors to show- the undecidability of certain questions 
about context-free languages. Ginsburg and Greibach [39] did a 
careful analysis of deterministic pushdown store acceptors. 

The use of a pushdown store when studying multitape Turing 
machines is illustrated in Book and Greibach 

Again, the text of Ginsburg [37] is a source bringing together 
much of the early results on the study of context-free languages by, 
" means of pushdown store. acceptors. The texts by Aho and Ullman 
- ;[3], [94] and by Lewis, Rosenktantz, apd Stearns [61] and the 
dissertation by Brosgol [18] illustrate how deterministic pushdown 
store machines can be used in compiler construction. 

The languages accepted by deterministic pushdown store ac- 
ceptors are deterministic context-free languages.' Grammars to gen- 

' .' ecate these languages are the LR(k) grammars of Knuth [95]. 

■ • . <* . ■ ■ 

^ 5. PARSING ' 

, As noted in Section 4, the class of context-free languages is pre- 
cisely the class of languages accepted by nondeterministic push- 
down; store acceptors. For applications in programming and com- 
. piling, it is necessary to have an = algorithm for recognition and 
* parsing of context-free grammars/Not only must one be able to 
» determine whether or not a given string is. generated by a given 
context-free grammar (recognition) but also when the string is so 
generated it is necessary to find a derivation tree for it (parsing). 
Clearly, these tasks can be performed by transforming the given 
grammar into some kind of standard, form, say Chomsky Normal 

\* Z\ 
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Form, and then enumerating all derivations from the initial symbol 
of a certairi, length (in the case of Chomsky Normal Form, deri* 
vations of length 2 |w|— 1 where w is the string in question). 
However, it is qyite clear that such a procedure is hopelessly time- 
, consuming and it is necessary to find efficient techniques for re- 
cognition and parsing. » v 

A great deal of effort has been expended in the development and 
analysis of algorithms for context-free recognition and parsing. In 
this^ section, one recognition algorithm is sketched. 

Let G = (V, 2, P, S) be a context-free grammar^ in Chomsky 
Normal Form. Recall that e e L(G) if and only if s Ve is a pro- 
duction in P, and that all other rules are of the form Z-> a or 
Z-> Y X Y 2 with a e S, Z e V ~ E, and Y u Y 2 e V - 2 - {S}. 
Given a nonempty string w e Z* say w = a x ... a n , a "recognition 
matrix" for w is as follows: If V ~ £ contains q symbols/ say 
' V- 2 = ;{Z lf Z 2 , ... ,Z q } with Z x > S, then let 

T=:[t(i,j,k)] iJ = J, kas 3 V 

be the three-dimensional binary matrix defined by t(i, j, k) = 1 if 
and only if the substring aj ... aj'+^j can be generated from Z k . In 
such a recognition matrix, t(n, I, 1) = 1 if and only if a, .. a e 
L(G). 

A recognition algorithm will result from an algorithm to con- 
struct the recognition matrix T given a x . . . a n and G, Let us consi- 
der a simple iterative algorithm that constructs T. First, note that 
the i = 1 plane is easily constructed by inspecting the set P of 
productions since t(l, j\ k) = 1 if and only if Z k -> aj is a production 
in P. Second, note that all other entries are obtained by considering 
those productions in P of the form A ~> BC where A, B, C e V — I. 
Thus, suppling that all values t(i, j, k),have been computed for 
r < i where i is some integer greater than 1 and all j, k, 1 < j, k ^ rt, 
compute t(i,j, k) by taking the Boolean sum 

I v I t (MvMt(i^s,j + s, k 2 ). * ; . 

Z^Z^Z^P »=i 



One can see that t(i,j, k) = 1 if and only if for some production 
Z k ->Z kl Z k2 e P and some s between 1 and i - 1; both t(s, j. kj) 
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and t(i - s, j + s, k 2 ) = 1. Restating this in terms of the definition of 
T, Z k ^> aj ... a j+ |-i if and only if there*^ust Z kl , Z k2 e.V-X 
such that Z k ~>Z kl Z k2 eP and for some s, Z k| => 



Often the recognition matrix T-is modified to be a two- 
dimensional matrix R whose entries are subsets of V - X: Z k is in 
the entry R(i, j) if and only if T(i, j, k) == 1. For a fixed grammar G, 
the modified mafjrix R can be computed from a! ... a n in a number 
of steps that is proportional to n 3 , and so the question "Is w in 
'L(G)r.can be answered in that amount of time; When the answer 
is "yes," then a parse can be obtained by searching the matrix. 

There are many variations on the recognition algorithm de- 
scribed above, some depending on the form of the grammar, some 
on the type of machine used in implementing the algorithm, some 
on the Way the matrix is represented. One of the more recent 
variations employs a different representation of the recognition 
matrix, a reduction of the computation of the transitive closure of a 
binary relation to that of Boolean matrix multiplication, and "fast" 
matrix multiplication techniques to obtain a running time on the 
order of n 2+e . 

A wide Variety of parsing techniques are discussed in Aho and 
Ullman [3], [94] and in, Lewis, Rosenkrantz, and Stearns [61]. A 
careful analysis of the Cock^e-^asami- Younger algorithm presented 
in this section can be found in Grahairfand Harrison [51]. 

6. VARIATIONS^ON CONTEXTrFREE LANGUAGES • V 

We have noted three characterizations of the context-free^ 
languages: generation by context-free grammars, acceptance by 
nondeterministic pushdown store , acceptors, and "algebraic" rep- 
resentation in terms of a "generator," the Dyck set on two letters. 
.We now consider'both extensions and restrictions of the context- 
free languages based on these three methods df specification. Spme 
of the resulting classes of languages Jiave feceived considerable 
" attention in the literature. * 



6.1. Let us consider restrictions of the context-free fifpguages; 
One approach puts restrictions on the form of the rules of a gram- 
mar. " 



aj ... aj^,^! andZ k2 ^ aj +g ... aj+i- 1« 
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A context-free grammar G « (V, £, P, S) is linear context-free if 
each production in P is of the form Z-^aY/? where Z e V - S, 
a, p e Z*, and YL e (V - S) u {e}. A language L is a linear context- 
free language if there is a linear context-free grammar G such thdt 
. L(G) = L. ~ ^ 

The class of linear context-free languages is a rich class of 
languages that plays a^ important role in the algebraic theory of 
languages as well as in the study of languages specified by multi- 
tape Turing machines. Every regular set is a linear context-free 
language, the languages {ww R | w e {a, b}*} and {a n b n |n^ 0} are 
linear context-free but not regular, and the Dyck sets are context- 
free butnot linear context-frde. 

The linear context-free languages have a simple algebraic 
characterization. Consider a linW context-free grammar G = 
(V, S, P, S). Let the productions in F\be enumerated asr b r m , 
and let T == {t l9 t ft } be a set of m new symbols. Let R = {tj, ... 
t} n |n;> 1, each t jt e T, th^re is a derivation y 0 =s>y x =s> • •• =>y n in 
G such thaty 0 = S and y n e Z* and for each i = 1, n, rj, is the 
production applied in the step .=> yj. Let h t , h 2 :T*->Z* be 
the homomorphisms determined as follows: For r, in P, if v { is 
Z-+ aY/?, where Y e (V - S) u {e}, then h t (t|) = a and /i 2 ('<) = P- 

It is clear that if tj, ... t jn is in R, then there is a derivation 
S=> 7l => ... ^ 7n in G with 7„ e I* and h 1 (t jl ... t jn )h 2 (t jn ... 

= y„. Clearly every string in L(G) can be so representedvThus, 
L(G) = {hi(y)h 2 (y R ) | y e R). It is easy to see that R is a regular set; 
in fact, R is the language generated by the left linear grammar 
G'X(V- Z) u T, T, P', S) where P' = {Z->t,Y | the production 
r, in P is Z-> aY/? where Ye V .- Z} u {Z-» t, | the production r, 
in P is Z-> <5 where 5 e Z*}. 

^The construction sketched above yields one part of the following 
characterization. 

Theorem 6.1. A language L is linear context-free if and~only if 
there exist homomorphisms h t and h 2 and a regular set R such 
that L = {h 1 (w)h 2 (w R )|we R}. 

Using Theorem 6.1 and the fact that the class of linear context- 
free languages is closed^urider certain simple operations, one ob-. 
tains the following result^ 
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* * ' . 

Theorem 6.2. The class of linear context-free languages is the 
smallest class containing {ww R |we{a, b}*} and closed under 
homomorphism, inverse homomorphism, and intersection with 
regular sets. ' 

It should be noted that the class of linear context-free languages 
is not closed under concatenation orlinder Kleene * . For example, 
{a*b n a m b m | n, m 5: 1} is the concatenation of two linear languages 
but is not itself linear context-free. 

Now let us consider a class of restricted nondeterministic push- 
down store acceptors that accept precisely the linear context-free 
languages. These acceptors are restricted by allowing the push-; 
down store to make . only one change from writing (pushing) to 
erasing (popping) during any computation. Thus, the read-write 
head on the top of the pushdown store is allowed to make exactly 
one "turn" or "reversal." ^ 

It is easy to see that if L is a lineaf context-free language, then 
there is a nondeterministic w one-turn" pushdown store acceptor D 
with, the property that L(D) = L. We use the representation 
{h 1 (w)h 2 (w R )|w 6 R} for some regular set R and homomorphisms 
h x and h 2 . The pushdown store acceptor D reads the first part of 
its input as h^w), nondeterministically guessing a; string w and 
storing w on its "pushdown store while checking that w is in R by 
simulating a deterministic finite-state acceptor for. R in its finite- 
state control. Then D empties its pushdown store while reading the 
remainder of its input and checks whether this string is indeed 
h 2 (w R ). - / 

To show that the language accepted by a nondeterministic one- 
turn pushdown store acceptor is a linear context-free language, one 
can use a "nondeterministic finite-state'transducer" that is, a finite- 
state acceptor that is nondeterministic and that produces output. 
The string written on the pushdown store before the pushdown 
store .makes its turn is the output of a nondeterministic finite-state 
transducer, and the input read after the turn is made is the output 
of another transducer whose input is the contents of the pushdown 
store. The class of relations represented by such transducers is 
closed tfnder composition. If the set of input strings to a nondeter- 
ministic finite-state transducer is restricted to a regular set R, then 
the output is expressible as h^hj ! (R). n R') where R' is another. 
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regular set and li t and hi are homomorphisms. Applying Theorems 
6.1 and 6.2 in this situation yields the following result. 

Theorem 6.3. A language L is linear context-free if and only if it 
is accepted by a nohdeterministic dhe-turfrj* pushdown store 
acceptor. 

• *. ■ • - • ••■"*■ 

In characterizing the linear context-free languages we restricted 

the behavior of pushdown store acceptors by restricting the way 

that the pushdown store's read-write head could move. Another 

type of restriction that one can make is , to restrict the pushdown 

store's alphabet to a single letter. The resulting acceptor is called a 

"one-counter acceptor." The class of languages accepted by such 

acceptors can be characterized as the smallest class containing the 

Dyck set on one letter and 'dosed under homomorphism, inverse 

homomorphism, and intersection withftregular sets. 

The class of one-counter languages and the class of linear 
context-free languages are not comparable: {ww R |we {a, b}*} is 
not a one-counter language and the Dyck set on one letter is not 
linear context-free. 

If one considers pushdown store acceptors that are counters but 
also are .restricted to \>e one-turn, then one obtains the smallest 
class of languages containing {a n b n |n^>0} and closed^ under 
homomorphism, inverse homojnorphisni, and intersection with 
regular sets. 

While numerous other subclasses of the context-free languages 
have been studied, the classes described here are ubiquitous in 
formal language theory and the methods used in defining and 
characterizing them are typical of the restrictions imposed through- 
out the subject. 

6.2. Now we turn to the specification of classes of languages that 
are not all context-free. We begin by considering specification by a 
generative structure. 

- A rewriting system G = (V, Z, P, S) haS'tn alphabet V, a termin- 
al alphabet Z c.V, an initial symbol SeV-Z, and^a finite set P 
of productions (rewriting rules) of the form 

WiZxWjZ^... WtZtWt + ^WjyxWjyj ... w t y t w t + 1 - 
where each Wi e Z* each Z, e V - Z, each y { e V*, and there is 
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some i such that y^Zj. Define a binary relation => on V* as 
follows: for a, p e V*, if p-* 0 is in P, then <xpP => afl/f. Let A 
denote the transitive-reflexive closure of =>. The language generated 
by G is L(G) = {w e S* | w}. 

The notion of a rewriting system is very 'general. Productions 
ihay rewrite more than one symbol per step and the new parts of 
the string (i.e., y/s) may depend upon more than the single corre- 
sponding symbol Zj. It is useful to consider restrictions on the form 
. of the productions in a rewriting system. One restriction is to allow 
" only one symbol to be rewritten in a step. With this restriction, 
rewriting systems may also be viewed as generalizations of context- 
free grammars. * * 

This important extension of the notion of context-free grammar 
is obtained by adding "context," that is, a symbol Z may be rewrit- 
ten as y only if Z occurs with the string a on its immediate left and 
the string P on its immediate right. Thus, a "context-sensitive" 
rewriting i;ule has the form ocZ/J— ► ay/J, and we distinguish between 
the case where "erasing" is allowed, i.e., where y may be the empty 
word, and where it is not. * 

A context-sensitive (without erasing) grammar is a structure 
G = (V, Z, P, S). where V is an alphabet, ZcVis the terminal 
-alphabet, S e V — £ is the initial symbol, and P is a finite set of 
productions (rewriting rules) of the form <xZ/}— >&yP % where 
<V P, y e V*, y ^ e, and Z e V — £ If the restriction that y 7* e is 
dropped, then the grammar is context-sensitive with erasing. . 

Since context-sensitive grammars (with or without erasing) are 
rewriting systems, the notions of derivation and of language gener- 
ated by the grammar can be applied. 

In a context-sensitive (without erasing) grammar G = (V, Z, P, 
S), there are no 'rules that decrease length: if p->0 is in P,.then 
I p | < I 0\. It is clear that if L is a context-free language and e £.L; , 
# then there is a context-sensitive grammar G stich that L(G) = L. ' 
The following strategy has been used to extend the definition. to 
include languages Containing the empty word. Let G = (V, Z, P, S) 
be such that; S^ fe is in P (so that e e L(G)), P - e} has no 
> erasing productiefqs,. and S does not occur on the right-hand 
side of any production- in P (so that erasing cannot occur by 
ocZ/J => aySdp => ay 5/?). Grammars with such a restriction are some- 
times called "extended" context-sensitive. 

A language L \s context-sensitive (extended .context-sensitive) if 
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there is a context-sensitive (extended context-sensitive) grammar G 
such that L(G) = L. '.' y 

While the restriction that a production of a rewriting system 
^rewrite only one symbol^^5er step forces one to consider only 
context-sensitive with erasing grammars, there is no weakening of, 
generative cfapacity. A language is generated by a cpntcxt-serysitivc 
with erasing grammar if and only if it is generated by an arbitrary 
rewriting system. A" language is context-sensitive if and only if it is 
generated by a rewriting system in which each production has its 
rifcjit-hand side at least as long as its left-hand side. 

The hierarchy of grammars that goes from context-sensitive with 
erasing to context-sensitive to context-frge to left linear context-free 
(also called finite-state) is known as the Chomsky hierarchy of 
grammars. The corresponding hierarchy of classes of languages 
corresponds to the four classes of languages tjiat have been domi- 
nant in the theory of formal languages: the recursively enumerable 
sets, which are the languages generated by , the context-sensitive 
with erasing grammars; the context-sensitive (extended context- 
sensitive) languages; the context-free languages; and the regular 
sets, which are the languages generated by left linear context-free 
grammars. 

In Section 2 the class of recursively enumerable languages was 
defined to be the class of languages accepted by unrestricted Turing 
machines. We have noted above that the class of recursively enu- 
merable languages is thevclass of languages generated by context- 
sensitive, with erasing gimmmars. There are numerous algebraic 
characterizations of this class but many of those of interest in the 
study of formal languages and abstract automata stem from 
characterizations using automata. We will describe some, of these 
characterizations as they arise in our survey of automata. 

The class of context-sensitive (or extended context-sensitive) 
languages can be characterized by certain restricted Turing ma- 
chines. A linear-bounded automaton is a Turing machine that is 
restricted in such a way 'that in every computation the machine 
visits only those tape squares upon which the input is originally 
written. A language is accepted by a nondeterministic linear- 
bounded automaton if and only if it is context-sensitive. This pro- 
vides an automata-theoretic characterization of the^ context- 
sensitive languages. 
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The class of linear-bounded autdmata.has a ''universal Turing 
machine theorem"; that as, there exists a linear-bounded automa- 
ton U that will accept suitably encoded inputs M and x where M is 
a linear-bounded automaton and x is art input accepted by .M. 
From this fact one obtains an algebraic characterization of the 
context-sensitive languages in the following form. 

Theorem 6.4. There is a context-sensitiye language L 0 with the 
property that for every context-sensitive language L there exist a 
nonerasing homomorphism h b a homoiffbrphism h 2 , and a regular 
set R such that L - h { (hZ l (L 0 ) n R). 

Both the class of recursively enumerable sets and the class of 
context-sensitive languages are closed under many of the oper- 
ations arising in the study of the context-free languages. In par- 
ticular, both of these classes are closed under union, intersection, 
and nonerasing homomorphism. The class of recursively enumer- 
able sets is not closed under complementation but it is not known 
whether the class of context-sensitive languages is closed under 
complementation. Every context-sensitive language is a recursive 
set and every recursively enumerable set is the homomorphic image 
of a context-sensitive language so that the. class of context-sensitive 
languages is not closed under arbitrary homomorphic mappings. 

Generalizations* of grammars and rewriting systems have been 
obtained in several ways. One method is to regulate the way pro- 
ductions are applied, say by taking subsets of the set of productions 
and imposing an order upon the members of .each subset. Thus a 
specified sequence of productions must be applied one after, the 
other, and until the sequence Js exhausted, no other productions 
can be applied. Another method of generalizing these systems has 
been to allow context to occur anywhere in the string, not neces- 
sarily contiguous to the symbol being rewritten. 

Within the wide variety of studies of generative systems in the 
literature, one of. the most fruitful in terms ^of mathematical ex- 
plorations has been the study of "developmental systems and 
languages," often called "L-systems." Whil? originally created to v 
model certain phenomena in developmental biology, it ha^ become 
one of the most vigorous parts of formal language theory ; In. the j 
traditional notion of a rewriting system, oroductions are appjied- 

■ • .... -.,i:6o : ' ; r ; 
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sequentially, one production applied at each step. In developmental 
systems productions are applied in parallel. At each step every 
symbol that can be rewritten must be rewritten. 

The simplest type of developmental system is an il OL system." 
An OL system G = (V, P, x) has an alphabet Y, a finite set P of 
context-free productions, and a string xeV*- {c}, the axiom. A 
binary relation =*> on V* is defined as follows: 

If Z |f ... ,Z k e V and for each i = 1, ... ,k, Z { y, is in P, then 
Z t ... Z k s>y t ... y k . The transitive reflexive * closure of 
=> is . The language generated by G is L(G)=*{w e V*|x 4r 

w). v . , * ^ ; . • . _ 

Thus an OL system has no distinguished terminal symbols antrf 
productions are applied in parallel. . ^a, JL- ' 

Consider the OL system G = ({a}, {a-> aa}, a). In thu^ase'/ 
L(G) = {a 2 " | n £ 0}. Now G has only one production but jE^G) is 
$n infinite language that is not context-free. However a rewriting 
system with only one production generates either a singleton set of 
the empty set" , 

There are many variations on the notion of an OL system and 
both languages and sets of infinite sequences have been studied". 
Some of the language-theoretic results have been applied in the 
'biological setting that motivated the original study and this area 
has had serious impact on mathematical studies of growth and 
form..' 

6.3. The method of specifying classes of languages that has 
proved' most fruitful for defining new classes is that of acceptance 
by abstract automata. Starting with the definition of pushdown 
store acceptors, we can illustrate some of the methods used to 
' define new classes of automata. 

Recall that a pushdown store acceptor has an input tape that is 
read in only one direction, a finite set of -states, a pushdown store 
that provides auxiliary storage, and a transition function that de- 
scribes how the states change in accordance with the input symbol 
currently read and the information obtained from the auxiliary 
storage — in this case, the symbol on the top of the pushdown store. 
Further, both deterministic and nondetermfnistic modes of oper- 
ation are considered. Thus we see that several parameters are in- 
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volyed: (i) the method of reading input; (ii) the type of auxiliary 
storage; (iii) the mode of operation, (iv) the number of read heads 
on. the input tape. In the case of automata that read the input only 
in one direction, it is possible to allow still another parameter: (v) 
whether the transition function may force the acceptor to read a 
new input symbol at each step, that is, to operate in real time. 

Consider a type of automaton obtained by varying one of these 
parameters. An auxiliary storage tape that has the basic form of a 
pushdown store but which can be read below the top without 
erasing is called a stack. As with a pushdown store, a stack can be 
altered only at the top, that 'is, the read-write head Can erase the 
contents of a tape square only at the top and it can write only at 
the top, but the read-write head of a stack can visit the interior of 
the stack in the read-only mode. A one-way stack acceptor has an 
input tape that is read from left to right, a finite set of states, an 
auxiliary storage tape that is a stack, a transition function, and a 
set of accepting states. Since the interior of a stack may be read 
without erasing, a deterministic stack acceptor can accept 
languages such as {a n b n c n |n £ 1} and (wcw| w e {a, b}*}~that are 
not context-free. 

The class of languages accepted by one-way nondeterministic 
(deterministic) stack acceptors has many of the same positive and 
negative closure properties as the class of context-free (deter- 
ministic context-free) languages. There is an intercalation theorem 
for one-way stack acceptors that allows one to show that 
ka^b n2 c n2 |n is a language not accepted by s.uch an automa- 
ton and hence that the class of languages* accepted by one-way 
{nondeterministic (or deterministic) stack acceptors is not closed 
•under intersection. 

Since a pushdown store acceptor is a one-way stack acceptor, all * 
of the questions about nondeterministic pushdown store acceptors 
that are undecidable (e.g., Is L t = L 2 ? Is L regular?) are also unde- 
cidable when asked about one-way nondeterministic stack accept- 
ors. The class of languages accepted by one-way nondeterministic 
stack acceptors is a class of recursive sets and this class is closed 
under arbitrary homomorphic mappings. Hence, the emptiness 
problem is decidable. From the intercalation theorem one can 
show that finiteness is decidable. As in the case of deterministic 
pushdown store acceptors, the decidability of the equivalence prob- 
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1cm for one-way deterministic stack acceptors (i.e., Arc L(Mi) and 
L(M 2 ) equal?) is open. 

There arc restrictions on the definition of one-way stack accep- 
tors that yjcld classes of automata whose power of acceptance is 
incomparable to that of the nondctcrministic pushdown store ac- 
ceptors. One such restriction is to. force the stacl^ to be "noncras- 
y|^tliat is, the top symbol of the stack may be replaced by a 
s^rof ahd symbols may be pushed down, but thcrlcngth of the 
stack cannot decrease. The language {a n2 ^ n £ 1} is accepted (by 
final state) by a one-way deterministic noncrasing stack acceptor, 
and the Dyck set on two letters is not accepted by any one-way 
nondctcrministic nonerasing stack acceptor. Another restriction is 
that the stack's contents cannot be altered once the read-write head 
visits the interior of the stack for the first time; the resulting stack 
is called a "checking" stack and a one-way deterministic checking- 
stack acceptor can accept the language {a n b rt c n | n £ l). 

Just as with pushdown store acceptors, one can restrict the stack 
alphabet to-.bc a single letter, thereby obtaining a (nonerasing, 
checking) "stack counter." With any type of stack acceptor one 
may further consider the number. of times tjjfe stack's read-write 
head changes direction, obtaining "finite-turn" ) ("reversal- 
bounded") stack acceptors. 

Having taken any of those variations of a stack as auxiliary 
storage, one may consider the different classes of acceptors ob- 
tained by varying the parameters (i), (iii), (iv), and (v). In each case 
the languages so specified are recursive sets. 

It is usually the case that a class of languages specified by deter- 
ministic acceptors is closed under complementation if the aefceptors 
"can be made to halt," that a class of languages specified bv nonde- 
terministic acceptors is closed under union, that a ilass of 
languages specified by nondeterministlc acceptors witj* one-way 
input is closed under nonerasing homomorphism, and'that a class 
of languages specified by acceptors with two-way input is closed 
■ under intersection. If no restrictions are placed on the finite set of 
states, then usually the class of languages specified is closed under 
intersection with regular sets.. 

If an acceptor is allowed two separate storage structures, e.g., 
two pushdown stores or two counters or two stacks, then the re- 
sulting acceptor may have the full power of a Turing machine 
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unless some other restriction itf imposed. For example, the running 
time may be restricted by bounding it by a recursive function of the 
size of the input. The study of abstract automata operating with 
restrictions on such computational resources as running time or 
amount of storage space fall into the area of computational com- 
plexity which will be briefly described in Section 7. 

6.4. Many of the classes of languages studied in the literature 
share certain positive closure properties. In a number of cases, the 
proofs that the classes arc closed under these Operations are re- 
markably similar. This has led to the development of a theory 
emphasizing the study of classes of languages defined, by specific 
collections of closure properties, of classes of languages 
characterized as the smallest class containing a given base and 
closed under certain operations, and of the relationship between 
closure properties, of classes of languages aiftl characteristic proper- 
tics of the behavior of abstract automata. Thj^icdry is mathcmaU 
ical and mjieh-of its impact is on the theory of abstract automata 
and the theory of formal languages as theories that arc part of the 
mathematical foundations of computer science. k 

Ah abstract family of languages (AFL) is a class of languages 
containing at least one nonempty language and closed under non- 
erasing homomorphism, inverse homomorphism, intersection with 
regular sets,- union, concatenation, and Kleene 4- . A semi-AFL*is 
defined in the same way as an AFL except that closure under 
concatenation and Kleene 4- is not required. An AFL (semi-AFL) 
is full if it is closed under arbitrary homomorphism. 11 An AFL (semi- 
AFL) is (full) principal with generator L if it is the smallest (full) 
AFL (semi-AFL) containing the language L. 

By drawing upon the many classes of languages s^cified by 
grammars or automata for examples, the various subsets of the 
defining AFL operations have been studied in terms of their 
relative independence. The class of context-free languages provides 
an example of a full principal AFL (one possible generator is the 
Dyck set on two letters). The class of linear coptext-free languages, 
is a full principal semi-AFL that is not an AFL (since it is not 
closed under concatenation and Kleene + ). The class of languages 
accepted by one-way nondeterministic acceptors 'that -operate in 
real time and th^t have a finite number of counters as auxiliary 
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storage (each such acceptor has a fixed finite njumber of counters,, 
but there is no bound on the number of counters that the acceptors / 
in the class may possess) is an AFL that 0 is not full and is not 

\ principal; it is closed under intersection and under those substitu- 
tions T such that for any symbol a> T(a) is in the class and T(a) 

^ does not cohtaiaeA, . . ^ 

V In the study of abstract automata and. formar languages, many 
types of acceptors have been defined in terms of the parameters 
described above. The classes of acceptors that specify AFL are 
"abstract families of acceptors." 

.•A storage, schema (r, I, f » g) has a 0 nonempty set Fvof storage 
symbols, a. nonempty set I of instructions, a partial function f froip 
F* x I.toJT*.u {0} that determines how storage is to be altered;^ 
at each step of an acceptor's computation, and a partial function 
fron^F* to the finite subsets of tliat specifies what infbnnaticm^ 
' can be obtained from the storage in one step of an acceptor's" " 
computation. , * 

The storage manipulation (write) function f is restricted so that 
Tor each y in u{g(y)|.y'e P*}: there is an identity element l y in F l 
. with the property that f(y ; , lp ==y for alKy suet that y is in g(y'). 
Thilf condition provides a uniform prbcedure" following the stor- 
§ge to remain unchanged while* the acceptor manipulates the finite- ■;; 
state control or reads input. *" 
The ^torage 1 information (read) function g is restricted so that the 
<, empty storage cbnfiguration is distinguished from all other storage ^ 
■ vfcorifiguratioris. \ " '■ v : . 

: For a jstofage schema (F, I, f, g)"the abstract family +;of acceptors. 
(AFA) defined by (r, I, f, g) is' th^class of all acceptors (K, I, 5,q 0 , 
F) where K is ^ finite set of states' £ is an input alphabet, q 0 e K is 



the initial state, F £ K- is the 
Hon function 5 is a function 1 



fcet-of accepting states, and the transi- 
Grom Kx (I u {e}) x g(r*) intp the 
''" finite subsets of it x I such tliatfy|5(q, a, y) # 0 for some q e K, [ 
a e I u {e}},is finite; / e <. V V ... . ° : . ' 

An AFA- is a storage schema together with all acceptors with 
finite-state controLanfl a finite number of input symbols and auxili- 
ary storage specified by the storage schema. Each acceptor has an 
i)f^^^|t^ and a set of accepting states. 

^^^^ass.'6f nondeterministic^ushdown stoie acceptors defined 
i^^§#is an AFA. The storage; schema determines that the top T 

■ ... , " ■ 'V:^ 1 • " V ■ : . : ■ 
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of the pushdown store is read at each step of a computation by 
defining g(Zi ... Z n ) = Zj fpt:ii >1, each Z v e T, and g(e) = e^The; 
storage schema determines that only the top of the pushdown store 
can be altered. The definitions of instantaneous description and the 
"compute" relation for nondeterministic pushdown store acceptors 
provide examples of how these notions are defined for AFA. 

For each AFL JS? there exists an AFA such that the" languages, 
accepted by final state and empty store by those acceptors that (i) 
ar£ specified.by this AFA, and (ii) pperate in such a way that there 
is a fixed finite bound on the number of consecutive transitions that 
do not read new input, are precisely all and only the languages in 
t <£. If the AFL is full, then there is an AFA with the same proper- 
ties except that condition (ii) may be omitted^p^versely, in just 
thfs way every AFA specifies an AFL arid every/AFA specifies a full ^ 

; AFL when condition (ii] is ofliitted. 

The formal definition of AfA is cumbersome but the concept has 
■giyen rise to some intuitive* descriptions of the behavior* of ^utb- 
mata that have been quite fruitful I in , suggesting' both techniques 
and results. The reader sh6uld consider the nondeterministic push- 
down store acceptors as typifying the concept of AFA. 

'Since, only, nondeterministic acceptors with one-way input are 
specified by the definition, of AFA^ this theory "does not provide a 
. framework for a comprehensive classification of abstract automata 
studied in the literature. However, aspects^ of this theory have been 
veryj useful in studying classes of^languages specified by various 

; ; autonjata. ( / 

.'^•;;.' : ;IWiany' questions about classes of-languages have been studied in 
forms of* AFL (and AF^): AFL closed urider substitution or re- 
versal or intersection, A.FL such that every language is semi-linear, 

- AFL characterized as the smallest AFL containing some collection 
of "bounded" languages (a language L is bounded if there is some 

£k£ 1 and some strings y x , ... ,y k such ihat L s {y t }* ...;{y k }*), 

■? etc. Perhaps the most fruitful application of this theory has been to 
the study of subclasses of the class of context-free languages arid to 
AFL and semi-AFL that h£ve many of the structural properties of 
the class of context-free r languages. A rich algebraic theory has 
evolved through t% ^tudy^of operators that produce infinite hier-_ 
archies of classes of languages, when applied without limit to a base 
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Some of the characterizing properties of AFL and semi-AFL and 
some of the important examples of AFL and semi-AFL play an 
essential role in the study of automata-based computational com- 
plexity. A different abstract specification of certain classes of 
languages relating to the basic questions of automata-based com- 
putational complexity is described in Section 7. 

6.5. The text by Ginsburg [37] offers background on subclasses 
of context-free languages but for a broader treatment of classes of 
languages one should see the texts by Salomaa [82] and by Hop- 
croft' and Ullman [58]?Certain motivation for some of this devel- 
opment can be found in Chomsky [2i], [23]. The study of 
L-systems is "enjoying great popularity among those working in 
formal language theory. The text by Herman and Rozenberg [57] 
provides motivation and much, of the basic groundwork. Stack 
acceptors werfe introduced by Ginsburg, Greibach, and Harrison 
[40] and much of the early work on stack languages is summarized 
"inHopcroftandUllman[58]. > \ 

The** primary reference on the notion of abstract family of 
languages is the memoir by Ginsburg, Greibach, and Hopcroft [41] 
while a secondary reference is the book by Ginsburg [38] Exam- 
ples of the rich/algebraic theory that has been developed, can be 
found in papers by Goldstine [42]-[44] and by Greibach [46], 
[49] v [50]. ■; ^ :; 

7. COMPLEXITY CLASSES OF FORMAL LANGUAGES \ 

*The study of computational complexity is currently the most 
active area within theoretical computer science (see the article by 
Preparata in this Study for a general discussion of this subject). 
This area tfan Be, approached in different ways: an axiomatic ap- 
proach, "abstract^ complexity, that is a part of recursive function 
theory; the study of classes of languages accepted (or functions 
computed) by various models of 'computation with restricted re- 
source^ and* the relationships between thes^dasses; and the analy- 
sis ;of specific algorithms and the complexity of specific concrete 
functions or problems in terms of restricted classes of algorithms. 

In jthis section we fpcus on classes of formal languages specified 
by abstract automata with restricted computational resources. 

• -16/ '.'iV:':V "' ; : 
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; v/7;I. In studying computational complexity one of the first tasks 
is fl^clefine the concept of computational difficulty (or complexity). 
Such a definition requires a method of specifying or representing 
algorithms, as well as measure of cost that is applicable to that 
method of specification. For example, if one wishes to define com- 
putational difficulty by means of Tunning times of programs, then 
one must specify the class of programs allowed in terms of their 
structure and atomic operations (or in terms of a formal program- 
ming language) and Jhbu| individual steps are to be counted, i.e., 
what a step is and how much time a step takes. 

One goal of a theory of computational Complexity is to provide a 
measurfc (or even a definition) of the "intrinsic" complexity of a 
function to be computed or of a problem to be solved. Thus one 
would.like to define the complexity of a function in a way that is 
independent of the method of specification and the applicable 
measure of difficulty. This, goal suggests the need for formal com- 
parisons between different models of computation and their various 
measures of complexity. 

Two quite different types of measures haive been studied. One is 
a "static" measure. The standard example of astatic measure is the 
size of a program (i.e., the number of statements), a parameter that : 
does not depend on the input. On the other hand, a "dynamic" * 
measure such as the running time of a program does depend on the 
size of the input and thus describes the behavior of computations of 
the program instead of its structure. 

By studying these questions in an abstract framework based on 
recursive function theory, it has been sfiofyn that there exist func- 
tions with no intrinsic dynamic complexity : For every program 
that computes the function, there is another program that com- 
putes the same function, but runs faster oh infinitely many inputs. 
By studying certain concrete problems, it has been shown that 
there exist recognition problems whose minimum running times are 
invariant (up to a certain factor) under wide changes of model. 

In the study of formal languages and abstract automata, it ap- 
pears that the most important questions concern recognition prob- 
lems that are known to be not only recursive but even primitive 
recursive (in fact, subelementary). We restrict the discussion here to 
a description of the basic questions of automata-based compu- 
tational complexity, focusing on multitape Turing machines as the 
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basic model of computation and placing recursive bounds on the 
computational resources of time and space. % 

7.2, The study of automata-based computational complexity fo- 
cuses on the study of abstract computing devices, their or- 
ganization and their computational power. This .work is aimed ar 
understanding the dependence of computational difficulty on the 
properties of the cpmputing devices on which the computation is 
performed. Yhere are many models of computation and many com- 
plexity measures that have been studied, but the most influential 
and widely studied model is the multitape Turing machine with the 
running time of a computation (i;e., the number of steps in a com- 
putation) and the amount of space used in a computation (i.e., the 
number of memory cells visited) as complexity measures. 

Given a model and a measure, one .wishes to char acterize the 
power of that model with respect to bounds on the measure. This 
leads to the study of hierarchies of classes of languages accepted 
based on hierarchies of recursive functions that bound the measure. 
In the-case of multitape Turing machines, the possibility of hier- 
archies based on the number of tapes has also been considered. J 

With the mode of operation as a parameter, the question of the 
equivalence .of tlje deterministic and nondeterministic modes arises. 
With the assumption that the nondeterministic mode is usually 
mor^pwerful (at least when considering classes defined by recur- 
sive rounds on the time or space used by multitaperTuring ma- 
chines), the question takes a slightly different form: When usipg the 
deterministic mode of operation, what is the additional cost of 
recognizing a language originally specified by a resource-bounded 
machine operating in the nondeterminfstic mode? . t ' 

Given a model of-computation, one may consider tradeoffs be- 
tween measures. In the case of Turing machines, one 'wishes to 
know how much space (time) is necessary to recognize a language 
originally specified by a time-bounded (space-bounded) Turing ma- 
chine. 

While there are a number of other questions to be considered 
when studying automata-based complexity, the themes just indi- 
cated play a dominant role. We-turn to a brief survey of the known 
(partial) answers to these questions. ° 

First consider time-bounded'computation. In this case we often . 
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restrict attention to machines that have an input tape that is* read* 
from left to right, finite-state, control, and some finite number of 
auxiliary storage tapes. Without further restriction siich machines 
have the power of ordinary Turing machines and so the cl&ss of 
languages accepted by such machines is the class of all recursively 
enumerable sets. However a recursive bound on the running time 
forces the language accepted by the machine to be a recursive set. 
For this discussion such a recursive bound will be monotone in- 
creasing and a "real-time countable" function: A function f is real- 
time countable if there is a deterministic multitape Turing machine 
M such that upon input of length n, M runs for exactly f(n) steps 
and then halts. . ' 

For any real-time countable function f, let DTIME(f) (NTIME(f)) 
be the class of languages accepted by those deterministic (nondeter- 
ministic) multitape Turing machines whose running times are 
bounded by f. If f(n) = n, then DTIME(f) is the class' of real-time 
definable languages and NTIME(f) is the class of quasi-real-time 
languages. The class of languages accepted by those deterministic 
(nondeterministic) Turing machines whose running times are 
bounded by a polynomial in the length of the input is denoted 
by P (NP), so that P = (J k2t l DTIME(n k ) (and *NP = 
U k ,iNTIME(n k )). " ' 

The classes DTIME(f) and NTIME(f) have been defined without 
restricting the number of auxiliary storage tapes used by the ma- 
chines. In the case of nondeterministic machines, it is sufficient to 
' restrict attention to machines with only two auxiliary storage tapes 
(where one may be a pushdown store and the other a stack). In the 
case of deterministic machines that operate in real jtime, the 
number of storage tapes cannot be so restricted: For every integer 
k > 1, there is a language that is accepted in real time by a deter- 
ministic Turing machine with k + 1 auxiliary storage tapes but not 
by any*deterministic machine with only k storage tapes that oper- 
ates in real time. For other time bounds it is not known whether 
additional storage tapes provide additional computing power. 
However, iif one is willing to sacrifice time, then one can restrict the 
number of storage tapes: every language in DTIME(f) can be ac- 
cepted by a deterministic machine M x with only two storage tapes 
that operate in time f(n) log f(n) and by a machine M 2 with only 
one storage tape that operates in time (f(n)) 2 , 

,. ~ .170 ; . 
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Given a time bound f, how much larger must g be to ensure, that 
DTIME(g) — DTIME(f) is not empty or that NTIME(g) — 
NTIME(f) is not empty? Using the method of "enumerate and 
dia^onalize" just as it is used in showing that there are recursively 
enumerable sets that are not recursive, one can show that if lining 
f(n) log f(ri)/g(n) = 0, then there is a language in DTIME(g) that is 
not in DTIME(f). It is known that one can do more in linear time 
than in real time: there is a deterministic context-free language that 
is not in DTIME(n). Thus, the case of real-time computation is 
quite special since 

(J DTIME(cn) * DTIME(n). 

oo & 

However for any running time f such that lim n ^ 0 n/f(n) = 0, 
Q c>0 DTIME(cf) == DTIME(f).^ 

In the nondeterministic case things are. different. For any 
time bound f, NTIME(f) = (J c>0 NTIME(cf). Also, if lim^ 
f f (n)/g(n) = 0 and lim^oo sup f (n + l)/g(n) < op , then NTIME(f) £ 
NTIME(g). 

As noted in the discussion in _the previous sectipns,_there are 
classes of automata for which the deterministic and nondetermin- 
istic modes of operation have the same power of computation (e.g., 
finite-state, acceptors, unrestricted Turing machines) and there are 
classes of automata for which the Nondeterministic mode of oper- 
ation provides more computational power than the deterministic 
mo^e. (e.g., pushdown store acceptors, one-way stack acceptors)/ 
For any time bound f it is easy to see that 

DTIME(f) <= NTIME(f) s |J pTIME(2 cf ). / 

c>0 

It is not known whether these inclusions are strict for any time 
bounds- at all. Nor is it kno,wn whether there is a time bound f such 
that . ' *\ v • 

(J DTIME(f k ) =. (J NTIME(f k ). - A ■ 

While these problems have been studied in automata theory for 
many years, they have received renewed attention due touthe in- 
terest today in the study of P and NP and the many attempts to 
show thatP NP. While many of those studying P and NP do so 
from the standpoint of concrete complexity, there are a number of 
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questions about formal languages and automata that are closely 
related. For example, (i) a language L is in NTIME(n) if and only if 
there exist three deterministic context-free languages L x , L 2 , and 
L 3 and a nonerasing homomorphism h such that ' 

. ° n L 2 n L 3 ) = L 

and (ii) P = NP if and only if P is an°AFL if and only if 
NTIME(n)cR 

Now let us consider the effect of placing bounds on the ^g^nt 
of space used in a computation. Just a sflfc h time bounds, a recur- 
sive bound on the amount of space i^TO in a' Turing machine's 
computations forces the language accepted by the machine to be a 
recursive set. For this discussion such a - recursive bound will be 
monotone increasing and a "tape-constructible" function: a func- 
tion f such that f(n) ^ log n is tape-constructible if there 'is a deter- 
ministic multitape Turing machine M such that" upon input of 
length n, M marks exactly f(n) tape squares on the some one 
distinguished storage tape and then halts, while visiting no more 
than f(n) tape squares on'any of its storage tapes. 

In the case of space-bounded machines, we shall consider multi- 
tape machines with a distinguished input tape that can be read in 
both directions (that is, an input tape with a read-only head that 
can move both left and right). When the space bound f(n) is such 
that f(n) ^ n, the ability to read input in both directions during a 
computation provides no additional computational power. When* 
f(n) does not grow as fast as n (e.g , f(n) = log n), the situation can 
be quite different with the ability to read input in both directions 
providing a great deal of additional computational power. 

For any tape-constructible function f, let DSPACE(f) 
(NSPACE(f)) be the class of languages accepted by those dejgr- 
ministic (respectively, nondeterministic) multitape Turing machines 
whose work space is bounded by f. In the case f(n) = n, the ma- 
chines are called (deterministic or nondeterministic) linear-bounded 
automata. 

The classes DSPACE(f) and NSPACE(f) are defined without re- 
stricting the number of auxiliary storage tapes used by the ma- 
chines. In both the deterministic and nondeterministic cases, it is 
sufficient td restrict attention to machines with only one auxiliary 
" storage tape. 4 
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Given a space bound f, how much larger must g be to ensure 
that DSPACE(g)-DSPACE(f) is not empty or that MSPAGE(g)- 
NSPACE(f) is not empty? Again using the method of "enumerate 
and diagonalize," one can show that if lim^o, f (n)/g(n) = 0, then 
there is a language in DSPACE(g) that is not in DSPACE(f), so 
that DSPACE(f) $ DSPACE(g). Combining this result with certain 
results on "simulation" and "translation," one can show that if 
lim^ (f(n)) 2 /g(n) = 0, then there is a language in NSPACE(g) that 
is not in NSPACE(f)>so that NSPACE(f) $ NSPACE(g). * 

When we compare the deterministic and nondeterministitf modes 
of operation for space-bounded computation, again 
we do not kriGyJjSSvhether there is a space-bound f such that 
DSPACE(f|=NSPACE(f). However, it is known that for any 
space-bound JffNSPACE(f)cDSPACE(f 2 ). It is not known 
1 whether this iiiclusion is strict or whether! 2 can be replaced by 
f 1 +e for so$\p e < 1 or by a function such as f (n) log f (n). 

In the case of the space-bound f(n) = n, the question of whether 

_ DSPAQBn) and NSPAGE(n) are equal has received a good deal of 
attention^Rjje class NSPACE(n) is exactly the cjass of context- 
sensitive (without erasing) languages and the question of equality of 
DSPACE(n) and NSPACE(n) is usually referred to as the "LBA 
problem" since Turing machines operating in space-bound n are 
equivalent t£> linear-bounded automata. Thus, 8 this question is often 
stated as follows: Is every c<5^text-sensitive language accepted by a 
deterministic linear-bounded automaton? ?,V$ 

i Si nce for any space bound f, NSPACE(f) £ DSPACE(f 2 ), we see 
that (J k2l DSPACE(n k ) = UkVi NSPACE(n k ). This class is often- 
referred to "as PS PACE. ^ 

A good deal of effort has been directed toward the general 
question of equality of DSPACE(f) and NSPACE(f) by focussing 
on the special case where f(n) = log n. It is known that DSPACE 
(log n) = NSPACE(log n) if and only if for all space bounds f, 
DSPACE(f) = NSPACE(f) (recall that we required a tape-, 
constructible function f to be such that f(n) ^ log n). By using 
certain encoding techniques and by considering DSPACE(log n) as 
a class of languages and examining its closure properties, the fol- 
lowing statements have beeft shown to be equivalent: (i) DSPACE 
(log n) = NSPACE(log n); (ii) DSPACEflog ri) is closed under 
♦ Kleene + ; (iii) every linear' context-free language is in DSPACE 
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(log n); (iv) every language accepted by a nondeterministic tfne- 
counter acceptor is in DSPACEflog n). 

73. In order to compare classes of languages, it is not always 
^necessary to examine the entire class. Determining the inherent 
complexity of some one language in the class may provide enough 
information to determine the complexity of the entire class. For 
example, to determine whether P equals NP it is sufficient to deter- 
mine whether one specific language is in P. 

A statement in the propositional calculus is in conjunctive normal 
form if it is a conjunction of clauses where each clause is a disjunct 
of propositional variables and negations of* variables: A statement 
in conjunctive normal form is satisfiable if there is an assignment of 
truth values to the variables under which the entire- statement is 
"true," that is, under which every clause is "true." The set of state- 
ments in conjunctive normal form that are satisfiable can be rep- 
resented as a language L 0 when the' propositional variables ar r e 
encoded as strings over an alphabet. 

It is easy to see that L 0 is in NP: A nondeterministkrTuring 
machine (in fact, a nondeterministic one-way sfcck acceptor) can 
"" w giSss^"anirssignihcnt"--of truth values to the variables in -a given 
statement and then determine whether the assignment yields the 
value "true" for the statement. All known deterministic algorithms 
for recognizing L 0 take exponential time; but L 0 -has the property 
-that-L^-is in-EliLaad^o_nl v if P = NP. T his result is established by 
showing that for every language L in NP there is a function f L such 
that f L can be computed by a deterministic Turing machine in 
polynomial ^ tiirie7.and ! (L 0 ) = L. Thus one can reduce the 
question "Is w in L?" to the question "Isf L (w) in L 0 ?" Supposfe that 
L 0 is in P. Then there is some k > 1 and some deterministic Turing 
machine M 0 recognizing L 0 and running in time n k . For an arbi T 
trary language L in NP, one can construct a deterministic Turing"* 
machine M L such that on input w, M L first compiles f L (w) and 
then simulates M 0 on fjw), so that M L accepts yyn and only if M 0 
accepts f L (w). Thus, M L recognizes L and since f L can be computed 
in polynomial time, say time n l , M L runs in time) w| l + | w| lk . 
Hence, L is in P. v 

Any language L in NP with the property that {f rl (L)|f can be 
computed in polynomial time by a deterministic Turing ma- 
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chine} = NP is called NP-complete. Many problems from logic, 
combinatorial mathematics, operations research, and the theory of 
automata have been'shown to be NP-complete when suitably' rep- 
resented as formal languages. 
The discussion above can be summarized as follows. 

Theorem 7.1. (a) Some NP-complete language is in P if and only 
ifP = NP. ; ' 

(b) The set of conjunctive normal form statements that are satis- 
fiable is NP-complete. 

Generally if .£? is a class of languages and # is a class of func- 
tions such that tf'^jSP) c jjf^then a language L 0 in jSf is JSP- 
complete {ox complete for & if ^({Lo}) - ■ 

We shall consider other important classes of languages and 
problems (or languages) that are complete for these classes. • 

Recall from Section 2 that questions about regular sets are easily 
decidable if the sets are specified by deterministic finfte-state ac- 
ceptors. When a regular set is specified by a regular expression, it is' 
usually the case that a deterministic finite-state acceptor is con- 
structed from the expression before testing membership, finiteness, 
etc. Here we consider the complexity of questions about regular 
expressions. 

If E is a regular expression, let L(E) be the set of strings denoted 
by E. For any alphabet 2, let IN(2) .= {L(E) | E is a regular ex- 
pression over £ such that L(E) ^ 2*}. 

' • ' ~ " " -— • v:.; ; ' _ ~ — " ;v t ;.^t^ 

Theorem 7.2. The language IN({0,1}) is complete*- for 
NSPACE(n), and DSPACE(n) = NSPACE(n) if and ohly^if 
IN({0, 1}) is inDSPACE(n). 

Theorem 7.2 says that testing whether a regular expression over 
(0, 1} does not denote the set {0, 1}* can be done nondeter- 
ministically using linear space and that this test can always be 
made deterministically in linear space if and only if every context- 
sensitive language (that is, every language in NSPACE(n)) can be 
accepted by a deterministic linear-bounded automaton. 

The proof of Jheorem 7.2 can be outlined as follows. Given a 
nondeterministic Turing machine. M with exactly one tape and one 
read-write head such that during a computation only thjat portion 
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of the tape where the input is originally written is used, and given 
an input string w, construct a regular expression E over an alpha- 
bet £ such that w is accepted by M if and only ifL(E) & E*. There 
is an algorithm to compute E from M and w that operates in linear 
time so that the length of the expression E is bounded by agon- 
stant multiple of the length of w. The regular expression E has the 
form E t + E 2 + E 3 where (i) the regular expression E x denotes the 
set of strings that cannot encoxie accepting computations of M on 
w because they do not begin with the initial instantaneous descrip- 
tion ot v M on w, (ii) the regular expression E 2 denotes the set of 
strings that cannot encode accepting computations of M because 

• they do not end with an accepting instantaneous description, and 
(iii) the regular expression E 3 denotes the set of strings that cannot 

b encode accepting computations of M because they are not of the 
form of a sequence of instantaneous descriptions such that each 
follows from.thp previous one. Recall from Section 3 that the set of 
"noncomputations" pf a one-tape one-head Turing machine is a 
context-free language; this fact.was used to show that the question 

/Tor a context-free grammar G, is L(G)L»equaI to Z*?" is undecid- 
able. Here the' fact that the linear-bounded automaton M visits no 
more than | w| tape squares in its computations on w allows one to 
construct the regular expression E 3 . Note that since w is given and 
M is a linear-bounded automaton, each instantaneous description 
in any of M's computations on w is of length | w| + 1. The fact that 
it is decidable whether M acgepts w is translated to the fact that it 
is decidable whether E is equivalent to ?*, and Theorem 7.2 shows 
that thi$ decision problem "costs" nondeterministic linear space. 

It is easy to see that there is a function f that transforms any 
regular expression E over Z to a regular expression f(E) over (0, 1} 
such that L(f(E)) ^ {Q, l}* if and only if L(E) ± I*; f simply en- 
codes symbols from E as strings in (0, 1}*. The function f can be 
oomputed in linear time by a deterministic Turing machine so that 
for any regulaf expression E, the expression f(E) has length bound- 
ed by a constant multiple of the length of E. Thus, "Is L(E) not 
equal to 2*?" and "Is L(f(E)j not equal to {0, 1}*?" both have the 
sanie space complexity. 

An algorithm that runs in linear time uses at most linear space. 
Hpce, 1^(0, Q) is in DSPACE(n) if and only if DSPACE(n) = 
NSPAGE(n). " " , 
v ; • '. ■•■ -\V ' 
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For any alphabet I, I* is a regular set; so Theorem 7.2 shows 
that the question of inequivalence of regular expressions requires at 
least nondeterministic linear space. In fact, this question is also 
complete for NSPACE(n). 

There are restrictions of the problem of inequivalence />f regular 
expressions that are* of interest. In particular, if the regular ex- 
pressions contain no occurrence of * or if the alphabet is simply 
{0}, then the question of inequivalence is NP-complete. / 
• All of the questions considered so far have been directed toward 
nondeterministic computation: One obtains either an answer "yes" 
(or "accept") or no answer at all. Thus we see that NP is closed 
under complementaticm if and only if.the set of conjunctive normal 
form statements that are not satisfiable is in NP, and the class of 
context-sensitive languages, NSPACE(n), is closed under com- 
plementation if and only if {E | E is a regular expression over {0, 1} 
such that L(E) — {0, l}*y is ^context-sensitive.. These techniques 
have also been used, to compare time and space. 

Now we consider briefly the complexity of questions about 
context-free languages. As noted in Section 5, the membership 
question for a language specified by a Chomsky Normal Form 
grammar is solvable in time n 2+8 , that is, every context-free 
language is in DTIME(n 2+e ). Also, it is known that every context- 
free language* is in DSPACE((log n) 2 ) and that there is a linear, 
context-free language L such that L is in DSPACE(log n) if and 
onlyifOSPACE(logn)-NSPACE(Iogn). . 

There is a "hardest" contex o t-free language^ language L 0 such 
that for any time-bound f, L 0 is in DTIME(iETit and only if every- 
context-free language is in DTIME(f), and for any space-bound g, 
L 0 is in DSPACE(gj (NSPACE(g)). This language, a nondetermin- 
istic version of the Dyck <set D 2 , is complete for the class of 
context-free languages, where the function used to "reduce:" an arr 
bitrary context-free language to L 0 is a homomorphism. That is, 
{h " l ( L o) h " r ( L o ~v {e}) | h is a homomorphism} is theclass of all 
context-free languages. , 

A class of languages closed under inverse homomorphism and ; 
intersection with regular sets is a cylinder. The smallest cylinder 
containing a given language L 0 is a principal cylinder with gener- 
ator L 0 . If JSf is a principal cylinder with generator L 0 , then the 
Mime or space complexity of the membership problem for a 
language in S£ is determined by the complexity of the membership 
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problem for L 0 . Thus the class of context-free languages is a prin- 
cipal cylinder, as are DSPACE(n*), NSPACE(n k ), and NTIME(n k ) 
for any k. On the other hand, the class of deterministic context-free 
languages and the class of linear context-free languages are not 
principal cylinders, nor are P and NP. 

Returning to the context-free languages, recall that the emptiness 
problem is decidable. It can be shown that this question (when 
suitably encoded) is complete for P; in this case, the functions used 
are computed by deterministic Turing machines that use at most 
log n space. ' * . 

7A Borodin [16] has provided an excellent overview of the 
study of computational complexity. The text by Aho, Hopcroft, 
and Ullman [4] touches upon many of the topics studied iti this 
field. A pioneering paper by Hartmanis and Stearns [54] still plays 
an important role in the study of automafe-based complexity. 
"The real-time definable languages were studied by Rosenberg 
[78] and the. quasi-real-time languages by Book and Greibach 
[15]. The hierarchy for deterministic real-time machines based on 
the number of tapes was suggested by.Rabin [If] and established 
by Aanderaa [1]. The hierarchy for deterministic machines based 
on running times was established by Hartmanis and Stearns [54] 
and improved by Hennie and Stearns [56], while for nondetermin- 
istic machines Seiferas [86] strengthened the initial results of Cook 
[27]; 

The importance of the "P — ?NP" question was pointed out by 
Cook [26] and underscored by Karp [59]. Some connections be- 
tween language theory and the "P — ?NP" question can be found 
in Book [11]. 

Classes specified by space bounds have been extensively studied. 
Hierarchy results can be found in Hartmanis .and Stearns [54] v 
Stearns, Hartmanis, and Lewis [89], and Seiferas [86], [87]. 
Savitch £83] established the best result on deterministic simulation 
of nondeterministic machines known to date. Translations between 
classes specified by different space bounds can be foiftid in Savitch . 
[83], [8£j£*rid Book [12M1 4], Problems relating to "DSPACE 
. (log n) = ?NSPACE(log n)" have been studied by Sudborou^t [90] 
and Mohien.f66]. . 

The inherent^pomplexity of questions about regular expressions; 
was explored ^y Me^r and Stockmeyfer [65] and extended by 
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1 V • < ^. • v .y -. ■ ' ; 

have found useful tools in fornfial jangifage "theory are des^nbed in i 

'"d collection of papers edited by .Yeh [93].^ " ; '■■ ,.; 

Paper^fn formal language theory ate published in a wide variety 
of joiirnals^but in particular in |he Journal of Computer and System 

i Sciences, Mathematical*^ 
Science with occasional [ papers 'in the JoiitnfyJ of the Association for 
Computing : : Ma<Mn&y, §lAM Journal of Computing, Acta Inferjna- ■', 
tica, and Information and' Control. Mariy of the results jure rfcpofted 

^in>yarious symposia but particularly in the annua) IEEE $ympos- . 

*■ iurii on Foundations of Computer Science, (fpr&rly 

■ Symposium on Switching and Automata Tfiheory) and in the InterV 
national Colloquium on Aufom^a, Languages, and Programming, 
sponsored by tho European Association for Theoretical Computer 

?T^cience. Papers in formal language theoryliave also been presented 
at the annual ACM Symposium on Theofy of Computing and the 
Conference on Information Science and'Systems. 

Since this paper/was first- written, a number of important pub- 
lications in formal language theory have appeared. Harrison [98] 
has writter^ah introduction to rpany aspects of ^fce . field. .Greibach c 
[99] has written- a scholarly history of the early development' of 
formal language tHeqry that illustrates how it developed from com- 
putational linguistics into a part of theoretical compttter^cience. A 

; . . bobk by Berstel [100], develops . the- theory;- of ^ntext-frefe 
languages from thf algebraic standpoint by using the potion or 

• .ratibrial transduction/ Rozenberg ; a^^ 

mathematical development of L-sysiems, and Wp^.d ,[1025 shows|* 
how abstract notions of a grammar are related'to t-systems^ Sal'6-< 
^aa %i03] has brought together a. number 6f combinatorial ' 
•properties, of formal languages in a delightful; way , f A sympQsiiim oh 
Wmal language theory was held in December A 979 with the goal 
of putting past work into perspective and reportmg'pn the status^of 
open problems. The Proceedings. £104] cpntaifis the texts of thir- 
teen invited lectures which speak to these goals; ;/ • % £ 
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FORMAL ANALYSIS OF 
COMPUTER PROGRAMS 

Terrence W. Pratt 



w "> The goal of this chapter is to provide a brief excursion into some 
A pf the -problems and ^ analysis of computer 

- programs^ Most 6f the work of^iterest is relatively recent, aftd 
RjnuchJs of a tentative knd exploratory nature. As with most devel- 
1-i oping research areas, there is substantial disagreement over even 
^ ' ^ are and how they should be approached. 

to present any of the , 
^ciWrerit formal theories in depth, as their value may be quite trans^_ 
V itory. Instead, an alternative; structure seems more appropriate: an ^ , 
f , approach that begins with ^n exposition of the problems them-- 
^selves, followed by ar survey of "some of the fprnial\a]^j^paches fa 
o the/ solution ;6f these problem^ that are under develomnent^f 
C c6ijtse : thevriM^£^obIem's to be solved is considerables-^rthis 
r ysurvey, fqur problems have been chosen the basis for discussion, 
four problems that seem to encompass! a substantial part of the 
work of interest. For the reader whose interest is aroused by any of 
thejopics raisedjithe ; concluding section suggests further readings. 



\ :^is~^ox%M&^pp^d to part by the National Science EpundfUon under 
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1. PROGRAMS AND PROGRAMMING LANGUAGES. 

A computer is a tool ordinarily used to perform quickly and 
accurately some complex or tedious computations. One has a set of 
input data, upon which a well-defined sequence 6( operations is to 
be performed to compute some output data of interest. In order to 
specify the structure of the input and output data and the compu- 
tation to be pcrformcd'a program is written according to the rules 
of some programming language. The programming, language is 
simply a predefined notation for the specification of programs. 

Let us take a particular problcnr^and a program for its solution. 
Suppose we wish to determine, to within a given accuracy, the 
quotient of two positive real numbers, where the quotient jjs be- 
tween zero and one. The input data for thoprogram set of 
three numbers, representing the dividend, divisor, ancfthe desired '?■"; 
accuracy. The output data from the program, is a list of the input \ 
data followed by a single real number, the quotient. Figure 1 gives 
one program for making such a computation. The algorithm is one 
developed by Wcnslcy [25]. The program is written in the pro- 
gramming language PASCAL, i.e., the notation used to express the 
algorithm is . that of the PASCAL programming language [11]. 

ffi FOUR CENTRAL PROBLEMS IN THE ANALYSIS OF PROGRAMS \ 

" • ■ ' ■' /'• ''■ . * ' jft 

Tha^ample program is neither particularly complex nor partic- v 
^ s ^^ T1 . e PMhe central issues in the analysis of pro- 
: ^gr^ The four that concern us are the 

ylpfiowing: ■■7- . ■ ; 

ai. What does the r ndtation mean? The first problem jn construct- 
ing a detailed and precise analysis of the program lies in- the nota- 
tion, the programming language. The program is to be taken as 
representing a set of instructions for a computer, but what are the 
instructions, and what exactly is the computer being instructed to * 
' •do?; For example, we seS in Figure'! the program segments : 

' V while E < D do ^ ^ 



mightlRm 
programming language 




program Division(input y outpi{t)' t . 
var R y M,N,E\ real; 
]';. ': function Quot(P % Q % E:xzvA): real; 

' var A,B t D,Y'. real; 
' ■ begin 4 :- 0; 

■^u^.^'f^; a- ■■■■ Z) 1; \ . 




begin / 

if P £ /I + B then be£n V : : = Y + Z>/2? / 

, „ " >4 .: = • >4 -KB r ^ 
od; 

B: = B/2; 

* end; ^, • 
be&n*readi^yN,E); ' ' ^ 

"■. \' write (M^mM^ - : . 
•end. " - 4 

v FlC 1 . A PASCAL program to compute Quotients. ] 
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prcygrraB pijl^rtain^aiPliPs what function the 

program^g| Pfcbmpute^ilmmediately we are led to tK$ 
closely relatec 

c. Is the program correct! Presumably the program is intended 
to compute fcome specified function, i.e., given a certain input data 
set, it should produce the appropriate output dat£ set. But does it 
do so for the entire range of input data sets? Again it. plight seem, 
and certainly would be^^ed^ffeMhe'arisw^r would be readily 
obtainable. But again, the truth is far different — most programs, 
when originally designed and written, are incorrect, but the fact is 
hidden deep in the complexity of the program structure. Most of 
the errors are detected only after a tedious, time-cohsuming, and 
essentially a4 hoc testing procedure; some are never detected at all. 
A central issue in the formal analysis of programs is tjiat of provid- 
ing methods for determining the correctness of programs direg A, " t ^ 

d. Is there a more .''efficient" program io^mpuie.the 
nonjl A final issue of considerable impor&tra^Ts 4tiat<x>f fir 
better versions of the program— programs that represent the same 
function but that have fewer instructions or are less costly in some 
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other measure. Here we confront the question of whether two:prcfo. 
grams compute the same function, i.e., after we modify the pro- 
gram, how can we be sure, that the new version still computes the 
.same function- as the origifiaf? And if it,docs, ho>V can we be sure 
that it is really better than the original? 

All of these problems need and arc amenable to formal treat* 
mcnt. . 

3. DEFINITION; m PROGRAMMING LANGUAGES . ^ 

Let us now take up the problems mentioned in the previous* 
section one by one, considering the problem^and difficulties en- 
countered in the application of mathematics to each. The. first 
problem is that of providing precise and complete definitions of the • 
programming languages used to write programs. Clearly, without 
such definitions efforts at careful analysis of programs to determine 
correctness or efficiency must surely fail. First, a bit of higtory will 

^clarify why current programming languages lack precise definitipns. 

Programming languages such as FORTRAN, COBOL, ALGOL, 
and PL/I (to name just four of the many hundred^in existence) 
developed primarily as solutions to a practical problem, 7 as< simple 
expedients. The practical problem was, and still is, that computers 
are very difficult to use directly. A typical computer ha^ built into it 
the capability to carry out programs of instructions written in a 
primitive "machine language." Typically the instructions in a ma- 
chine language consist only^of patterns of binary digits, e.g., the 
binary sequence 001 1001 1100000(^^ as 

, an instruction to add two numbers stored in the internal memory 
ofr the computer and store the result in a third storage location in 
the same memory. Early users of computers wrote programs di- 
rectly in such machine - languages. However, the task was extremely 
tedious and error prone. It was soon recognized, that Abetter 
"languages" for writing programs wer$ essential. Here the general 
purpose nature gLcomputers was btought to bear: If was clear 
that, once a betfljjhotation for writing programs was 'devised, a 
gingie program (wnuen. in machine language) could then be execu- 
ted; by the computer whose function' ^ translate the 

^j^^4W /rom the new notation into instructions in the .machine 

1^ a 
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second step, 6k executed by the computer to produce the desired^ 
computation;: Any competent programmer could devise a notutipn^l 
a "programming language," that would be suitable for his purposes; | 
l^c could then write the appropriate translation program to trans- 
late his ndw programming lanWage into machine language, and 
immediately his subsequent programming could be entirely in the 
new programnling language-^e i^ver again, or perhaps only oct 
casionaljy^ would need to go back\to. the difTiculties. of program- 
ming in. machine language. Once t^basic concept of a compiler 
program, a program to translate a program in one notation into a 
program in a machine language (or another, already available, prQ- 
gramming language), was grasped, and accepted, numerous • pro- 
gramming languages were designed and "implemented" on: different 
computers. Subsequently, the need to move prbgraqis between. dif- ^ 
ferent computers and the need for programmers to communicate 
with each other has led to some standardized languages, such as 
FORTRAN and, COBOL, for which compiler programs arc avail- 
able on most computers. . . 
1 For our purposes, the central fact about these programming 
languages is the ad hoc method of their design and definition. If we 
ask, What is the meaning of this or that statement in, a program?— 
i.e.,' If I write this down, what will it cause the computer to do? — a 
precise answer is generally not to be found except in the following 
way: The computer will perform those instructions produced by 
the compiler when it translates the given statement into machine 
language. Since the actions the computer will take for each instruc- 
tion are completely determined by its internal structure (the 
manner in which its circuits are connected), and the instructions 
produced by the compiler are completely determined by its internal 
structure (i.e., by the actions generated by .instructions df the com- 
piler prpgram when it is executed by the computer), the "meaning" 
aniterrf^of actions to be taken by the computer is completely and 
^'^^j^^usly defined. "Unfortunately, it is quite impossible to 
*,t^1«3^&of this definition of the language as given by the structure 
t>f tfi^computer and" the compiler; both are entirely too compU- 
cAted^to be comprehensible.. The computer may Ijpve miHions v bf 
circuits and the compiler may contain thousands of instructions. 
Thu^ like the physicist who knows all the laws of motion. yet still 
cannot Htch y the. baH^w.e know 'the computation* invoked by a. 
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; program is cQmtylqtely and prccisflly determined, yet we still cannot 
^ predict what the^program will do. Clearly the defijnition of a pro-: 
gramming language by a compiler and computer is unacceptable; 
its precision is of little valufy , i ; 

The value of a precise, complete, and intelligible dcfinition v #f : iC ]\ 
►'-programming language was recognized early. The .development of 
, adequate mathematical tools for constructing such definitions has .. > 
been $loM/,*1i9yvcvcr. |>argc early successes were obtained in the . 
development ;^r:a ; formal theory of tfic syntax of programming 
:Janguagc^(a^t(ipic- discussed ih another chapter),' most notably in . 
■ i 'thc theory r of context-free grammars and thcyr derivatives. Thus it 
V;> has long been possible to give ^precise, complete*'^ 
^defimtion of what it is allowable to write in a ^^0^^^ivtn * 
" language,, which constructs arc allo.wcd,. : and Whlcli^%^V,' how the 
statements arc punctuated, etc. What has not been possible in any 
'widely accepted form is precise definition of the meaning of each 
; statement or athcr syntactic construct. '. # 

A typical programming language may have from ten . to .fifty or 
more different* types of statements; declarations* expressions, etc. 
. The meaning of each of these constructs is 'usually closcjy tied, to 
the context in.,which it occurs* so that the same statement appeal 
^ ing in different contexts represents a diffcrcnt^ct of computational 
step£ to be performed. Perhaps the most 'difficult aspect of defining 
£ the semantics of. a programming language lies in capturing the 
notion of the "computational context" in which each statement is • 
executed., " ' 

ym)rmaHiefinitions of the semahtics of programrping languages 
ynavc "been based on a broad range of approaches. Unfortunately/ 
' space constraints preclude ctfen an example definition of a simple 
JangUagc. A brief sketch of three different approaches must suffice' 
to convey the flavor of somct^the mw promising approaches. A 
final critique suggests * sorife of ^hfc difficulties with these ap- 
proaches? None has been accepted as a method for practical defini- 
tion of any large class of actual programnfung languages. Informal; 
^ methods remain the standard for new language definitions. ' • ■ 

f : The "Abstract Machine" Approach,* The most straightforward 
approach to the formal definition of programming languages is 
based on the notion, of ah abstract machine (sometimes termed a : 



ERIC 



ERIC 



176 - T err ende W. Pratt 

virtual computer or exeQUting automaton): An abstract iqachine con- 

sistsfbf: - ■• ..v,V- ■ "\. ' ; " " V 7' 

.'.,,'"..•*.•*.' « ■* / . * • ■ 

1. a set of states, eacti of which is a complex formal structure s 
containing represent anions of programs and datay 

2. a set of primitive operations on states (state transConrfations) 
that map &ne state info^ "next" state; / 

3: a transition rule that qiaps any state into a primitive oper- 
; ation, representing Jhe next operation to be/applied, or into 
/ the halt operation; and * " / 

v 4/ a subset of states that serve as initial states./ s \ 

' : % ' • N v : ■■"./■ Si " : 

Tacomplete the definition of a programming language, an abstract 
machine must be augmented by V. r' |h ; j k 

'5. a fomal language, xonsik^ representing 
syntactical^ valid programs in the language, usually defined 
• by a formal grammar of some sort ; a'nd ; 1 
6. 3 translate valid progran^ into an 

^ initial state of the abstract machine. *N • 

The formal languagQ defines the,set of valid programs in the 
language. For each of these valid /programs the function that it 
computes is determined as follows / 

1; First the program is mapped into ,an initial state Of the ma- 
chine, by means of the translati^ function. . \ 

X F ( rprii this initial state the trarisitipn rule and primitive oper^ij 
ations of the abstract machine define a sequence oCstate transi- 
tions, first from the initfal state to a next state and then from that 
^tate to another following ^(ate, etc. Ultimately either the halt oper-; 
ation is produced, and thi/s th? sequence' ends, or the sequence may 
continue without halting. If the°sequence does end, then the filial 
Itatd in the sequence represents the result of program execution. 
The Initial state is assumed to contain ttfe jnput data ; the final state . 
then contains the output data. 

The Viehna Definition Language. The most widely known exam- 
ple of the abstract machine approach is the formal system known 
as the "Vienna/Definition Language" (YDL). This .approach was 
developed by Vienna Laboratory of IBM for use in the formal 
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•definitibn of the language V ^L/1, it has, been used in definitions of - . 
PL/l7ALGbli 60, and BASl(^ r amc«}g others v \ 
1 In a VDL definition "bf a programming language, each state of.* 
the abstract machine i§ represented by a finite tree with labeled 
args and nodes The ••state tr&" tljat exists at any point during a 
computation r sequen<$e contains complete informatip^ about the 
state of the. computan^rt^t that point: It contains all the data 
structures used by the program* the program text itself, various . 
"housekeeping' 1 data structured needed to keep track of data names 

' and/attributes, etc., &nd.a "coritrbr^btree tiiht contains the next ^ 
operation to- be executed a* any point together with afiiy * pending 
operations. All this^ information is represented in the form ' of 
subtrees within the overall state tree, [%._ ^ v 

The primitive operation^ in a VDL abstract machine are defined 
as tree transrormations that modify the state tree in appropriate 
ways. The transitioif rule is .straightforward: The next operation fo^ 
be applied, given a current state tree; is the 6peratiop<tnat 'labels a r 

' terminal node of the control subtree in this current state! Usually m 

> the control subtree has only a single terminal node, so. that the nextV * 
operation is jiniquely specified, but^it is possible for there to b^ 
multiple terminal nodes, in which case the abstract machine be x - ./ 
comes nondeterministie: More than one sequence of state tnansi- . *~ 
tions is possible depending on which of the^possible next oper- 
ations is chosen for applicatioH to She current state. 

The initial states' of the abstract machine pre defined as follows: / 
An initial state treg'has a number of invariant comporieXrts, pri- 
jnarijy representing efements 4f the state that are initially empty. 
The. major variant component is the program text subtree. This A a 
^ubtree contains the program that is to be executed, represented as 
an abstract syntax tree derived from the orignakprogram, The 
abstract syntax' tree of a program is essentially the parse tree of t)ie 
program with mo$t of the nonessential syntactic elements deleted* 
such^s punctuation, and a number of implicit specifications added, ~ 
such as default attributes for data stiaictures. v 

The^set of valid, programs in the language Is specified a 
context-free grammar, and the translation from valid program to 
thejeprrfesponding abstract syntax tree is defined by aKranslation \ 
function li that maps the parse trees defined by the cbnt^cUfree| 
grammar into the ^appropriate abstract syntax trees. \ ^ 

'4*- * - • ' i. ■ # '■' ' • ', ■• •• > . 
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The S^ott-Strachcy Approach, A second, formal approach to pro^; 

gramming language definition attempts to ^directly describe the v 
. meaiiing^of a program in terms of. the function it computes. ;T|ie \ 
.foundations^ of this approach are due largely to Dana Scott and * 

Christopher-'Strachey. Concisely sunjmarized, this approach, is 

i based -oh the .following constructs that differ from" those used in the 

abstraCt machine approach: , 

v ■ . \ »-.* ■■' . • ; ■ ' t , \ 

• L :The computatiop invoked by a particular program is, still de- 

scffbcd in ternjs ofo sequence of states, buf the states now contain 
• b'hjyytfie (lata being manipulated, not the program itself. * y 

''. 2/;States'We'2ohstructedl^oiri b^ise sets, functions ofi^hese base 
sets, higher-order .functions-over these functions, etc. A lattice struc- 
, t'ure is jmppsed on these sets, and the functions dejined must aljrbe 
"continuous" irt aiVappropriate sense on these domains.- ' • 
; • 3. Tte meaning* of thV va types in w the languajejis 

defined directly in terms of functions mapping states intb state*. 
Those definitions "typically will involve general recursive equations 
of the form f '^ f (f). A uniqu^solutibh for Such, systems of recursive 
equations is ^foyided by a theory of feast fixed pointy generalizing 
tW classical recursion results of Kleeup [12]. • 

-4. The -overall meaning oCa program is defined recursively, as 
tfie application of the function defined by the "first' statement in tljg 
program to the initial stale, followed , by the* application of/the 
remainder of the program toUhe resulting state* • i. ,.. 

5. because the meaniijg of a program is denned direct^ jn tewris 
of its csyntactic Structure, a^hg recursive composition oj the f&nc- / ' 
-Hons defined by each of the 'constituent statements of thfe program 
in the Written sequience, ttan^fers^ of cdntfol from one^ statement to 
.! anotlrer within thp progfam.(e.g^ by goto or exit statements) caiJse 
1 constderable difficulty. A* rather /comple^ fopmal structure called a 
-■ f "continuation"^ introdu£e&tO slirmounfrthis difficulty. • 

The resulting definition of a programming language in this for- 
malism is "a set of recursive equations over a set of syntactic apd 
seWhtic (domains (basp sets£ ^The set >bf valid programs in the , 
language's defined by a^sey of productions (equations) that essen- 
tially form k cont^xt-free«grammar for the language. , 

The|loare "Axiomatic? Approach, trvttys approach, the precise ' 
' definition of any sort of detailed "state'Vof a computation is avoid- 
ed entirely. The meaning 9f a statehient in th^^nguage iVdefjned 
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i^ tefms; of axioms an^ rules of inference, which ajlow one to 
describe properties of th6 computational state before (6rvaftei:) e^ . : v ' 
cution of a ^giyen statement, given .known properties -of the state 
after (or before) execution of the statement; - : , • ■ N / ■ 

Twq examples N will illustrate the technique: 1 ' ' lj " ' ' % ■ 

1. Assignment dxiom: Assume'a simple language With assignment : 
statements- of the fojpm 'v- 

whose intended meaning is that the current values of the variables 
X 9 y, ... referenced in the right-hand sid^ are taken as arguments 
for the function F, whose value is then assigned as'the new valifc of 
the variable Y. An axiom that formalizes this meaning would be: 

- {p{X,F(XJ,.-- ), ...)}Y:% F(X,Y, - 

which is read: If p is a predicate (propositional formula) specifying**^ 
relationships between the values o/ the variables in the .current 
state, and if p(X\Y,.\) is true after execution of the assignment *' . 
statement, Y: = F(X^Y y ...), then we may deduce that " : 
p(X,F(XT, . ..), .. .) was a irue statement about the relationships 
between the values of the variables before Sxecutiori of thS assign- > 
n^eht statement. ^ 

?. ' fMe." statement inference rule. Assume a simple language 
with while statements of the form: 




while BdoS 

where B is a predicate as above and S is a list of statements. The 
intended meaning is that the - predicate B is to be evaluated using 
the values of the variables in the current state, and if its value is 
true, then the statement, list S is tot be executed to produce a new 
state, and the entire process repeated until the value of B in the 
current state is falser The meaning of this statement would be for- 
mally defined by the rule: 

{p} while *d^{^A-iB} ! - ; • s 

which is read: If we can establish, for predicates p and B, that 
whenever both p and B are true^efore execution^of the statement 
list S then p is true after execution as well, then we jnay conclude 
that whenever p alone? is true before execution of the while state- 

S ' . - • ■ 1B6 x 
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merit, ^yKilc rfo 5r thtit after execution both p ian^ n '0 must be 
:•>;.*:, .'true. Predicate p is termed the J /oop ifwariani^^\;\- 

/ " Sumimi|y of the Problem of Programming Language Definition. 

/ \tfone of me techniques for form^U definition of programming 
° languages /has been entirely successful. Various reasons may be \ 

Advanced as to why this is so. On the one hand, existing program^ 
«. ; filing language^ are, almoitf Without exception, far too irregular 
a • " J semaritically to admit simple definitions. This' irregularity is^rgely V 
' a result of the ad hoc method of their original definition. Thviis^ 

are in s&me sense intrinsically difficult to describe fonnalj^. .Equally 
-v--^ 0 ^ 3111 ' n Contributing to the lack of Recess of formal defini-v 
■|f£. tionartecfiniques*has been deep disiagreement as to the criteria for 
*'< W\ . evaluating these, techniques. Each of the approaches mentioned 
• ^ fairly successful.according to sortie desiderata and 

i has failed accordingno others. ; „ . '. 

r "\f >A brief summary of some of the objections will suffice to illus- 
f /trat^the difficulties, without really doing justice to any of them: 

1. The abstract machine approach introduces' a corilplex extra 
*layer of definition^ One cannot directly understand the meaning of 

_ : . a particuldr statement in a program, instead, the statement must 
first be traced through a complex translation into a sequence of 
operations for the abstract'machine,- and then the effect of these 
. operations must be observed; In ajdjiition, the abstract, machine 
approach seemsito "overspecify" the l^guagfe s^haqtics by\speci- 
\ * Tying .details about exactly h0\^ partic^r constructs are;tb be 
handled,* even though* this specification is irrelevant to the compu- 
tation qkthe output data. . ^ 

2. The^Scott-Strachey approach avoids much of the detail of the 
state structure, The meaning of a statement is defined directly in 

- terms of its effect on the current states thefe is no translation step. 
Howeve;, the definitions produced are complex and rather obscure,, 
due in part to fundamental problems with description of common 
programming language constructs, particularly ^ansfers of control 
{ and changes in thfc meaning of varjabfe names. / . 

3. The Hoare axiomatic^prpach also provides direct semantic ' 
definitions for statements aiid^fiier syntactiexons^ucts but suffers 

, 1 even more seyerely from dlfficultips with description of some 
common* ^ogramming language constructs, particularly transfer^ 

. v .... ° ' . • *:;.v: < • . ■ ■ ~ •■ 



ERIC 



FORMAL ANALYSIS OF COMPUTER' PROGRAMS ' / 181 

of controj, the meaning of variable names, and data structures with 
shared storage^ a. - ' •••'.•.v. -J.'... /j'm- , \ 

V A • . :i • . ■ ■ ■/?.■ ; ■ I i.; 

4. THE FUNCTION REPRESENTED BY A PROGRAM 1 j 

** . " •,.- , f- ' : 1 . , ' ":V. ' ■ ... V* i 

• • ... <r ■ • ■ • ' ' ■ ■ ' ■ '/ i i "' • . 

: Any pr6gram qefinds a napping from a family of input data! sets 

into a Camily of ojutpjit data sets^and thus any program defines a 
function (o^cpyr^G the rang^ may be the empty set if the program 
never 'termiriatef fdr ; jiny input). It is natural to ask, given a pro- 
gram, W^at\ function idoes it 'compute? Assume now that the pro- 
gramming lahguaige has been carefully dqfineld so that we jnay 
ign^e^nyvqii^iiqp^of ambiguity in the notation. If We knew what 
function the vpro^ra^did^ fact compute (or represent), tlien we 
^cbuIdv'dbtenAiq the program, i.e., whether it 

Cojrnjpute<k}^ie function we desired. The neXt secticm Hakes up this 
prbblem, the Dfoblem of proving a program correct. However, it is 
^worthwtiild to^prdpe thp facile observation that any program rep- 
resents « function^efote m6ving to the correctness problem. ; 
-\Therd/are a «hmber of characteristics of the functions rep- 
resented/by programs that distinguish them from most of the func- 
tions .ordinarily'- seen in elementary mathematics.! First, the func- 
tions cbniputed are defined, almost without exception, in terms of 
£ases//The family of input data sets is partitioned by the program 
into a set of disjoint cases, and for each case tfie'program computes 
" the output data, set Jn a different way. Of course thfc definition of a 
• function iq terms Of different cases is familiar from mathematics. 
What is noteworthy is the number of cases -treated: Even a rel- 
atively simple progriim with jio loops is likely to consider hundreds 
ofoiiilferent cases. If^the program contains a loop, then the number 
of cases treated is potentially unbounded. 

Given a program, it is straightforward* to determine the cases 
that the "program discriminates and the computation invoked in 
each case. Eaph different execution path, through the*program cor- 
^ responds to one? case. Along each path, parj of the computation is 
concerned with choosing tlfe path itself (ve., with computing values 
thafcare used at the next biranch point to determine the path along 
wfefch execution should contihue), and thus this part of the compu- 
tation actually discriminates the case into which the input data 
' fallsJ This part of the program is usually Armed- the control strue- 
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-riiwjpf the program. The remainder of the ttomputafion along the: 
/path is concerned wiih computing the appropriate Output data. If 
! one writes ddwn for each path the control part of cach path and 
the associated computation of thp output, then a description of the 
'functiotico^ 

• The division ptogrjim of Figure- i provides a/ good example. 
Figure 2 shows a partial description of the c^ses treated by this 

. program and the output data computed in -each case,* Singe this 
program contains a loop, the number df cases treated is potentially 
infinite, i.e.,, there are an infinite number oWiffercnt paths through 

. the program, y • t 



tit 



sc 



Output Computed 



E > 1 

l/4<E£l#AP;>(3/4)Q 
1/4 <E£\/2/\Q/2&P < 
1/4 < fc<; 1/2 P < Q/2 . 

l/4<£^ l/2AJ^P<Q/4 

1/8 < E £ Ui^P Z (7/8)Q 
1V8 < EX l / 4 A ( 3 / 4 )Q ^ p < ( ? / 8 )2 



K;«0 



7/8 



K:=0 + 1/2= 1/2 ■ V ' ; 
K: = 0 y 

K: = 0+ l/2 + (l/2)/2- 3/4 
R: = 0+ 1/2* 1/2 
Rj^O + ll/^^ 1/4 
K: = 0 

R:*0+l/2 + (l/2)/2+((l/2)/2)/2« 
R;~0f 1/2 + (l/2)/2 » 3/4 



Fig. 2. The function computed by the program of Figure 1. 



; The assumption that the division program may treat an un- 
bounded, number of cases is not realistic, of course. We do not 
expeefcthe value of £, the acduracy of the quotient, to be arbitrarily 
small on any real computer. But even if we allow E to be no- 
% smaller than' 2 T 3 .? (certainly not unrealistic' on many computers), 
. % the number of cases treated by the program, although now finite, is 
* so large as to effectively preclude any p^se-by-case description or 
^analysis.* . v 

A second source of cQmplexity in describing the functions corri- 
» puted by programs lies in the complexity of the control structure 
computations that, discriminate the cases treated by the program. 
These case discriminations almost always,inyolve complex deterrrii- 
. nations about the, sequence in which tHe' input data is presented. 
Thus th/various pieces of input data are not treated independent- 
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ly, but, father, the clppcar&hcc of a ccrtain*piccc of input data early 
in the sequence causes a'changc in the way a later piece of data is 
interpreted. Typically a substantial part of'thc control structure of 
a program serves to save the "context" crcatccj by early data in the 
input sequence so that it may be vsed to control the interpretation 
of th$ later input data. Thus, if we wish to describe the function 
computed by § program in terms of the cases, treated, then th<j 
description of the cases must include these sequential* interdepen- 
dences in the input data. I \ ■ V < 

Yet a third source of complexity lies jin the presence of paths 
through a program thaj. can never be traverse^ for any choice of 
'input data.^Thcrc arc no Such paths in the division program of 
Figure ! because for any possible execution path* there is in fact 
some choice of input data 'that' will cause execution to take that 
^path (ignoring the problem of a minimum value for E). However, 
few programs of "even moderate size have this property. Consider 
the program of Figure 3, known as Jfhe 91-funct\(fn. T$je definition 
of this function is: • • v - / 

For integer input value X t the-value of the function is 
•.' / A*- 10 if'Ar > 100; and 91 otherwise. 

program Example (inpuu output); 

var AMnteger; . ■ 

function F91{ X integer) integer ; ' ' 

vaM,B:integeri '( 
begin A: = X; 
B:= 1; / 

while (/I £ 100) or (B# 1) do 

If A Z 100 then begin A : - A + 11; . 

c B: = B -f 1 end 

■ t 
else begin A\ =* A -10; 

* 

v B:»B-1 end; 

/ 

F91: - A — 10 

end; 

begin read(X); ' * - 

\yrite(X>F91(x)) 

cnd . «. • . • . % 

Fig. 3. The 91-function. ' 

. * 20 0 
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'^Notc that the program contains many paths that can never be 
traversed for any cluuccof jnpitf' value. Moreover the control 
structure of the program effectively conceals the simplicity of the 
function computed.. It is difficult to determine that the description 
V ' m above is in fact an accurlta description of the function computed 
\ from inspection of the program itself. * ; :. . ; 

\ Let us now return to the original problem: Given a program, can 
,\ we describe the function that it computes?. The answer is that, 
\ while we cah do so, the description is likely to be almost as com-* 
\ plcx as the program itself. Even if. the function has 9 simple defini- 
tion, as with the 91-function above, the program - structure may 
Effectively hide the fact. But far more likely the program will treat 
such a large number of ca^cs, interrelated in such complex ways,, 
that no simple definition is possible. In fact, in many cases) ,thc 
program itself may be essentially the simplest way we can think of 
to describe the function precisely. Thus, r although a program 
Si w{iys\ computes a function, we cannot presume in general that 
such a function admits of a definition that is substantially simpler 
i than the program itself, 

' > K \ ' ; 

5. proving Program correctness 

Oncq a program has been written, the next problem is whether it 
is "correct," i.eAwhether the program performs the intended com- 
putation when i\ is executed. The traditional approach has been 
almost entirely ad\lwc; the program is' simply tested by being exe- 
cuted with spme 'example data sets, and the resulting output data is 
inspected. If the output data is that desired for the given input data 
in all the test cases, tWcn the program is presumed correc^ 

The difficulties ;with\progtfam testing as a means of determining 
correctness are easily seen, given our discussion of the preceding 
section. EaclY execution X path through the program determines a 
separate computation. In principle each of the possible paths v would 
need to be'tested with eachrof the possible input dat* sets before 
one could be sure* that the^program was correct. Of course the 
number of cases involved in even* simple program i$ usually far 
too large to allow such exhaustive testing. 
Program testing, then, is an inadequate means of showing'that a 



<, 201 



ERIC 



t \ 1 I OK MA I. ANALYSIS OI' ( COMPU111R PROGRAMS 185 

program is correct. N&t only is testing inadequate in theory, but 
also in practice: most computer programs beyond the very smallest 
contairt errors, even pr9grams (hat have been thoroughly tested 
and ustfd for some years. These latent errors arc an enormous 
practical problem, and better means of assuring the correctness of 
programs have bccn-cagcrly sought.' * 
' How can one* formalize the notiort of a program being "correct"? 
Informally, the program is correct if it computes the desired func- 
tion, i.e., if, for each of thb desired range of input data sets, the 
program produces the desired output data set When executed. We 
need, then, a precise definition of the function that the program is 
to compute! 1 Given such a function definition, and given that the 
meaning of the programming language has been precisely defined 
so that the function that the. program computes is also known, then 
the program is correct if the function that it computes is in fact the 
given function. 

Although this concept of program correctness is straightforward, 
one is immediately mired in difficulties in attempting to apply the 
concept to actual programs written in real programming languages, 
Both the problems of the preceding sections come to the fore: First, 
wc usually do not have a precise, complete, and intelligible defini- 
tion of the programming language; second, even if we did,' the 
function computed by the program is likely to be extremely com- 
plex to specify. If the specification of tbp function desired is itself 
almost as complex as the program in question, how can we be sure 
that the specification is correct? Moreover, if we are to specify the 
function desired, we must, have a precise notation for making this 
^specification. This notation must itself be defined. ' 

The general problem may be restated: We hav£~two notations 
for defining functions, a specification language and a programming 
language. Given a specification written in the 1 specification language 
and a program written in the programming language, the problem 
Vtb determine whether bo^h define the same* function. If so, we . 
may say that the program is "correct," i.e.; that it is a* correct 
encoding of tl>e specification. Alternatively, we might say that the 
specification is a "correct" description of what the program com- 
putes. Of course both program and specification may be incorrect 
in the intuitive sense, in that neither actually represents the intent 
of.the person writing them. 
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^Formi) Correctness Proofs of ProfcrttiiiH, A number of ap- 
proaches have been studied to the problem of proving programs : 
correct. The most .widely known is the induct ive assertions ap^f 
proach developed first by Floyd [7], References to other methods ' 
are given in the final section below. 

In the inductive assertions* approach, the , usual specification . 
language used is the predicate calculus and ordinary mathematical 
notation. One gives an input predicate defining the doniain pf the 
function represented by the progrum to be proved correct and an 
output predicate defining the relationship between tjic input data 
values and the output data valucsri.c., defining the function com- 
puted! In addition a loop invariant must be provided for each loop 
in the program. A loop invariant is a predicate that is satisfied 
whenever the loop is entered from outside during program execu- 
tion and that is also satisfied after each subsequent traversal of the 
loop (i.e., after each execution of the statements within the loop). 
From these- predicates and a precise definition of the meaning of 
the statements in the hinguagc, it is possible to prove that the 
program is jn,fact correct, i.e., that whenever the input data satisfies 
the input predicate, the output data computed by the program yvill 
satisfy the output predicate. 

As an example of the technique, consider the division algorithm 
of Figure 1. Since the main program simply calls the function Quot 
and, prints (he result, we should concentrate attehtion on function 
Quot. iCthcHthrce input values to Quot arc P y Q, and, E, then 'an 
appropriate input predicate would bp:: 

(0^P<Q)A(£>0) 

and the corresponding output predicate! wc>uld be: 

\\ P/Q^k^P/Q 

where R is. the result computed by Quot. These two predicates 
define the function that we wish the program to compute. 

The nctft step requires that we provide a loop invariant suitable 
for use in>fhe proof. This is the most' difllcult and creative step, for 
the appropriate loop invariant) is oftelf hard to fiiid. An appropriate 
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V 

loop invariant, a predicate that will bo true whenever execution ' 
reaches the test E £ J) at the beginning of the while loop is: \ , 

% A «■ Q x Y »• • 

« **Q x (0/2) ■ • 

D » 2 k for some integer 2: 0 , ) 

. * 1 P/Q ~ D < Y*£ P/Q. 

The input and output predicates serve! to specify the function 
that 1hc program is to compute. To prove the. program correct, i,c., 
to prove that the program docs in fact compute this function, wc 
proceed by proving three lemmas, or ^Verification conditions": 

Lfmma I. If the input data satisfies the input predicate, and wc 
begin to execute the program, then when We first reach thb while 
loop, the loop invariant is satisfied. ' 

Lhmma 2. If the loop invariant is satisfied and the condition for' 
executing the loop is ftlso satisfied by the current values of the 
program variablcs/thcn after execution of the statements within the ; 
loop the loop invariant is again satisfied (for cither of the two paths N»- 
through the loop). 0 a 

Lemma 3, If the loop invariant is satisfied but the condition for 
executing the loop~isi not satisfied by the current values of the 
program variables, then after executing the statements following 
the loop through the final statement in Quot the output value 
satisfies the output predicate. \ 

To formally state the lemmas requires a formal definition of the 
programming language, so that wc have a precise conception of the 
meaning of "execution" of the statements in the program. Of 
c6urse, wc lack such a definition of PASCAL, the language use^to 
write the program of Fjgurc 1. However, since the program is very 
simple, it will suffice for this example to assume that the meanings 
of assignment. statements and while statements are given by the 
axioms stated in Section 3. The lemmas to be.proved may then be 
written: ^ . • * * 

Lemma 1. If the three values read by the Read operation and 
transmitted as parameters to Quot arc the numbers p, q f and e, and 

i ... 1 ■ . . \ . 
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.' ■'■ ■ • ' * •. ■ j ■ *. •. * ■ • ■ ■ ■• - ■ . ■■ *' ■•'-*'' 

(O S p < q) A(« > 0) then (taking the loop invariant and, substitute 
irig the values assigned -to each of the variables in the preceding^ 
assignments in the pro gram of Figure 1): 

>,.:• 3/2=^11/2) , , - ;/•••-.< : - 

••.•v. : ' ; : - 1 = 2-*forfc = 0-.- x 

Lemma 2. Irat the beginning of the while loop^we know that: ..~ 

• »: . .■■ " •■ • ^ '■ : . . ' ■ 

E^D ••''.•j 'v 

/I = Q x 7 • 

. b = q x (d/2) : v ■ : .f 

b = 2 -k for some integer /c 2 0 
and ' 

/ P/Q-D<Y £P/Q 

then it is also true that (again substituting new values assigned 
during execution of the loop): ' ! • 

if P 2: /I + B then A.+ B = Qx(7 + D/2) :' " ' " : : 

v- v . B/2 = Q x ((D/2)/2) ' 

D/2 = 2 ~* for some integer k 2 0 
P/Q - D/2 < 7 + D/2 <. P/Q : ; . . 
or if P < /I +• B then /4 = g x y- , ." 

B/2 = Q x ((D/2)/2) 
D/2 = 2"* for some integer fc 2 0 

' P/Q - D/2 < y <; p/e: 

From this lemma we can conclude that if the loop invariant is 
satisfied by one. set of values for P, Q, E, A, B, D, Y, and the loop is 
executed, then the new values also satisfy the loop invariant. , 

LemMa 3. If at the beginning of the while loop we know that /„ 



E>D 
A-Q x Y 
B=Q x (D/2) 

D = 2~ k for some integer k 2 0/ 
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?•-•'.>:. " P/Q-D<Y<zP/Q 

then . . a .• • . . .' «• 

•^\V-' V ; ; ; v: p/q -e <y <; p/q. ;,' . .'■/;■ ./V / 

Proving these three lemmas would suffice to show that the o pro- 
gram computes the desired results for inputs in the specified 
domain, provided that execution ever terminates at all. To com- 
plete the "proof of the program 4 " we must also show that program 
execution always terminates for any input date satisfying the input 
predicate. Such a proof of termination is usually given separately. 
For the example program, execution can fail to terminate only if 
the predicate in the while statement were to be satisfied on entry to 
the loop and always thereafter, for some set of valid input data, no 
matter how thany times the loop was executed. A simple argument 
suffices to show such a situation impossible for this program. How- 
ever, in general, proof of termination is hontrivial. This completes 
the proof of correctness of the division program. As an .interesting 
exercise the reader might erijoy, "proving" the 91-function (Figure 
3) using the specification given in the accompanying text. 

The conclusion provides references to further, more complete, 
discussions of the formal theory of correctness proofs for programs. 
Note how closely the problem of proving correctness is tied to the 
problems discussed in the two preceding sections: We must have 
-both a good useable formal definition of the programming 
language and a way of specifying the function the program should 
compute in order to state and prove the conditions for the 3 program 
to be correct. : *v>. V ? .'" '■ "•>-■ . 

• 6. EQUIVALENCE AND TRANSFORMATION OF PROGRAMS 

The fourth and final problem of this discussion is also one that 
has generated a number of formal studies*4he problem of opti- 
mizing or improving a program 7 through transformation of its in- 
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ternal structure. The need for such program jtransformation arises 
in many applications. The .most common ca^e is probably the need 
to improve the^ efficiency of a program, either through reduction ok 
its storage requirements or through f eduction of its execution time. 
Another set of useful transformations in^prove the' internal struc- 
ture of a program to make it more intelligible. 7* 

The formal problem is easily' stated. Given a programming 
lahguage L, consider all the programs that may be written in L^the 
*set R L . There is a natural equivalence relation definable on this set: 

Dehnition : If P and Q are program! in. ; then P^Q iff P 
and Q represent the same function, i.e., iff P and Q are defined over 
the same domain of input data sets and for each data set in this 
domain, they produce identical output data sets. • < 

Each of the equivalence classes will, in general, contain many pro- 
grams, since there^will be many diffefent ways to compute the same 
function. « < - 

A transformation on programs in R L is a mapping from R L into 
R L . A transformation, T, is valid if for any program P, P .ess T(P), 
i.e., if .the original and transformed programs both compute the 
same, function. Ordinarily the transformations of greatest interest 
are the valid transformations that also produce an improved pro- 
gram according ttf some measure of program structure, or per- 
formance. ' > 

A general method for determining the equivalence of two pro- 
grams in the set R L (for any given language L) would be a gre^it 
help. That^s, we would like to have a rule for deciding, given any 
two programs P and Q in R L , whether P a Q. Unfortunately a 
general solution to this equivalence problem is not possible: The 
question is undecidable for any programming language L b^ond 
the most trivial (see, e.g., Constable and Muchnick [3]). 

Optimizing Transformations, Fortunately it is not necessary to 
have a general solution to the program equivalence problem in 
order to -develop valid transformations on programs. The largest 
class -of useful transformations is the class of "optimizing" trans- 
formations, transformations that decrease the execution time or 
storage requirements of a program! A few examples will suffice to 
illustrate the sorts of transformations involved. 
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.1. Moving constant computations <At of loops. Consider the pro- 
gram segment : > 9 f v . v 



read{Z) 



whUeJf >T'dd, • ' 

begin t 

■ ■ . • \ " 

Y: = sin(X) + square-roat{Z)/5 




Assuming that Z is not assigned a new value elsewhere within the 
loop, then thejcomputation of " 

^ square-root(Z)/5 

is repeated on each execution of the loop, and yet its value is 
always the same. If, as is often the case in large programs, the loop 
is executed many thousands or millions of times, the repetition of 
this "constant" computation may use a substantial amount of exe- 
cution time. Such a computation may be moved outside of the loop 
and computed only once, thus decreasing the execution time of the 
program. An equivalent optimized program segment would be: 



read(Z) 



T: = square root(Z)/5 
while X # T do / 



f : ■ 

■: . •'••> % -.• • • • 

±92\ > . Terrene* W. Pratt ^ 

s ' ' • * - ■ * * v : 1 ' - M - '+ 

^ ; begin . v 



f 1 



7: = sin(X) + T. 



A fopiial proof that the transformed program is functionally equiv- 
alent to the original program reguirBs a formal definition of the 
programming language. Even informally there are a number of 
subtleties that muat be considered before one 'may determine exact- 
ly the conditions under wn|ch the transformation above is valid. 
Mpst notably, the functioff square-root may change* the values of 
variables in k the program indirectly, through so-called "side effects". 
In such a case, even though the value returned by the function is 
constant, the transformation may npt produce a functionally equiv- 
alent program. 

v 2. Eliminating computation of repeated subexpressions. Consider 
the program segment: . i. * . 



A\2*l + J]: = jR + B[K,1*1 ."•+ J] 



R: = (2*I+J)/W 



* The expression 2*1 + iJ appears in three places in tlj& segment If 
this segment is within a loop which Is executed many thousands of- 
times, then much execution time would be saved if the ralue of 
2*1 + J could be computed only once each time through The loop 
and this value saved and used in each of the three occurrences / 
instead of being recomputed at each occurrence. The more efficient/ 
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pv^gvam would then til:', v ' > * 

T£MP: = 2*/ + J 
v A\^EMF\ : = 1? + BlKiTENfP] 

R:±TEMP/W «, 



Under what conditions is this transformation valid 5 ? Again a pre- 
cise definition of the meaning of the programming language is 
needed. It must be proved that the value 'oPffte expression is the 
same at each occurrence when -computed as in the origirial pro- 
gram! Obviously the values of variable / ^nd J must be known, to 
be the same when execution reaches each of the three occurrences 
of the expression. Thus determining the validity of application of 
this transformation requires tracing the Aqw of the computation 
and the possible changes in values of the variables that may occur. * 
Many other optimizing transformations have been studied. Aho 
and Ullman [1] provide a survey of these results. 

Transformations That ImjfiW Program Structure. A second 
ciass«of transformations of intAest are those that make the struc- 
ture of a program more intelligible. Much of this work has been 
motivated' by the problem of changing "unstructured" programs, 
containing»many statement labels and goto statements transferring 
control to these labels, into Structured" programs that have no 
labels and goto statements. The formal foundations for such trans- 
formations are Well established ; Some of the earliest work was that 
of Bohm and Jacopini [2j, who were able to shpw that for every 
program, in a simple programming Iknguage there was a function- 
ally equivalent program that required only _sim'pleLstatement_se-_ 
quences and while^statement loops. In particular goto statements 
ahd statement labels^were* not necessary. Later work. has clarified 
the situations in which such transformations are possible without 
the introduction of new variables and conditional branching. 
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Kputh's survey provides a thorough introduction to this 

work.'- • % N ■ , , • ■ j ' ' '. % ■■■ 

♦ 7. suggestions"for further reading • •:.!, ' 

•. - . . ■:[,■! *• . 

Many methods for. the formal definition of programming 
languages have been c(eveloped. The papers in Engeler'[6] and • 
Rustin{21], particularly the survey by Wegner [24], are jajuseful 
starting point. See also the earlier survey by DeBakke|5[4].vAu° 
integrated treatment of various formal aj^proache^ is fpUnd in a 
paper by-Hoare and Lauer [9]. The Vienna Definition Language is 
described in Wegner [23] and Lee [16]. The original Work, by the 
IBM Vienna Laboratory is sun^iiarized in "Lucas aiBjtyalk [17]. 
Other approaches that utijize abstract machines are^found in -the 
directed graph models of the author [20] and Landi$s,wor,k based 
on the lambda calculus [14]. Teniient [22] provides J[ sorvey of the 

v v Spott-Strachey approach; see Milne and Strachey [19^1 for a more 
complete discussion. The axiomatic approach tp lingi^y|exiefii)i- 
tion is described by Hpare in a series of papers [8]^ti^t^^^^\ 
Much of the work on proving correctness of programs^ is based 

. on the inductive assertions method suggested by Floyd [7] /Manna 
[18]-provides a good.survey of this and other approaches, together 
wjjira further bibliography! The survey by Elspas et al. [5] is also a 
usefuj. starting point. ; v 

° Optimization transforations for programs are surveyed in the 
second of the two volumes by Aho and Ullman [l]. 6 L!edgard [15] 
provides^an introduction to "structure-improving" 4ransformations 
X)ri programs. Knuth's more jnformal survey [13] is also useful, v 

All of these topics are the focus for much current work aimed at 
providing solid formal mathematical foundations for improving the 
practice of computing. The newness of the field and the complexity < 
and ad hoc* nature of much of the current practice make such 
mathematics* both* a considerable intellectual challenge and an ex- 
citing endeavor in which the results may have jmmediate and strik- 
ing impact ° > •■ ^ 

REFERENCES 

L A. Aho and J. Ullman, The Theory of Parsing, Translation, and Compiling, 
. Prentice-Hall, Englewood Cliffs, N J., 1973. 



/ > * '^^^Ri^if ^Analysis of computer programs 195 
■ ' ' , . >! ■ * / . ...... 

2. c Bohm and Gl^jaicopini, "Flow diagrams, Turing machines and languages 
'" with only two^o^ations^ules," Comm. ACM, 9, no. 5 (May 1966), 366-371. 

3. R. Constable ^ ^ft-S. Muchnick, "Subrecursive program schemata," J. Comp. andL^ 

4. J.. ty. £$Bj$&er, "Semantics of programming languages," in J. Tou, ed., Ad- 
i vancesffylpfo.Sys. Sciences, vol. 2, Plenum Press, New York, 1969, pp. 173-527. 

5. Elsp^K. Levitt, R. Waldinger, and A. Waksman t "An assessment of tech- 
nw&ffit proving program correctness," Comp. Surveys r 4,no. 2 (June, 1972), 

9%W) '.' \- * : '- 

6. E^ngeler, ed., Symposium on Semantics oj Mlgorithmic Languages, Lecture 
yMp(&\T\ Mathematics; vol. 188, Springer-Verlag, New York, 1971, 

/j^WpXoid "Assigning meanings to programs " Proc. Amer. Math Soc. Symp. in 
y'i/'MppLMath, vol. 19, 1967, pp. 19-32. , . * / * 

'/;8m?G: A. R. Hoare, "Ah axiomatic approach to computer programming," Comm. 

PCM, 12, no. 10, (October 1969)° 576-580. . 
!. A. R. Hoar/ and P. Lauer, "Consistent and complementary formal theones 
r*' of the semantics of programming languages," Acta Informatica, 3 (1974), 135— 

?■ 153: ■ - ;/•■: . I-'./' ' - L 

10. C. A. R.. Hoare and N. Wirth, "An axiomatic definition of the programming 

• language PASCAL," Acta Info., 2, no. 4 ( J973), 335-355. 

11. K. Jensen and N. Wirth, PASCALrOser Manual and Report, .Lecture Notes in 
doiftputer Science; vol. 18, Springer-Verlag, Berlin, 1974. 

12. S: Kleene, Introduction to Metamathematics, Van Nostrand, New York, 1952. 
13/D. Knuth, "Structured programming with go to statements," Comp. Surveys, 6, 

/ no. 4 (December 1974), 261-301. V f - ^ 

,14. P. Landin, "A correspondence between Algol 60 and Church's lambda nota- 
V tion," Comm. ACM, 8, nos. 2 and 3 (February and March 1965), 89-101 and <■ 

• 158^165. , - ■ 

15. H. Ledgard and M. Marcotty, "A genealogy of control structures, , Comm. 

ACM, 18, no. 11 (November 1975), 629-639. 
.16. J.Lee, Computer Semantics, Van Nostrand Reinhold, New York, 1972. 

17. P. Lucajf and K. Walk, "On the formal description of PL/l, n . Ann Rev. in Auto. 
Pro0.,6, no. 3(1969), 105-182. 

18. 2. Manna; Mathematical-Theory of Computation, McGraw-Hill, New York, 
■ \ 1974. ' -J'":-: 

19. R. Milne and C. Strachtfy, A Theory of Programming Language Semantics, 
Wiley, New York, 1976. \ 

20. T. Pratt* "Application of formal grammars and automata to programming 
language definition," in r/t. Yeh, ed., Applied Computation Theory, Prentice- 

. Hall, Englewood Cliffs, NJ., 1976, pp.- 250-273. - 

21. R. Rustin, ed., Formal Semantics of Programming Languages, Prentice-Hali; En- 
glewood Cliffs, N. J:, 1972. * sJ 

22. R. Tennent, "The denotational semantics of programming languages, - Comm.— 

ACM, 19, no. 8 (Aug.. 1976), 437-453. 

23. P. Wegner, "The Vienna Definition Language," Comp. Surveys, 4, no. 1 (March 

1972) 5-63 * 

24. : "Programming language semantics," in R. Rustin, ed., [21], pp. 149-248. 

25. J. H. Wensley, "A class of non-analytic iterative processes," Computer J. ,1 
(1958)," 163-167. , 



21 



COM^ 

Franco P: PrepaMta 



1. INTRODUCTION: OBJECTIVES AND MODELS v , .- ; ; 

As rioted by several authors, computational complexity is one of 
the most rapidly developing areas of research in the theory of 
computation ; in sorpe enthusiast's words, it is at the heart of com- 
puter science. Naturally, Aht phrase computational complexity 
means different things to different people, and a rather sophis- 
ticated taxonomy has emerged in the literature, % both according to 
the research objectives and according to the specific problem areas. 
Although the discussion of a classification scheme of the various 
brands and facets of computational complexity is not the main 
theme of this arjticle, we shall now and then touch upon this aspect 
in or^erJp_:pLace in the appropriate context the notions to be 
presented.^ / ;7 ~ S 'ff 

Since^ancient 'times people have tried to devise procedures for the 
solution of^ specific "problems. Once a way to solve a problem is 



* This article, was originally written for this collection inH^ 6 - Only, minimal 
changes have been made to the original version, which reflects the state-of-the-art at 
the time of writing. (Note added in proof.) 
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. found, the, difficulty of the task of obtaining a solution to an. in- ^ 
stance of the problem is assessed' on the basjs of the efforts it 
involves, such as solver's ;tijne, use of specific instruments, etc. The 
desire to' give a precise quantification of the difficuUy of a pro- 
cedure/ or algorithm, became particularly felt with the advent* Of 
digital computers, leading to, a 'discipline called "computational 

^mplexity." As a» incidentalfefemark, the phrase "degrees.of diffi- ; 
culty" was the technical ^cursor of t^fortunate expression 

» "computational complexity* 7 introduced by Hartmanis and Stearns 
in 1965 [1]. - - 

Complexity is clearly a measure of effort, that is, of cost. Cost, on 
,the other hand, is precisely quantified in the context of the mpdel 

, which is selected for carrying out computation. The choice of the 
computing model essentially reflects the viewpoint and the flavor of 
the analysis, as we shall see below. In all models, however, com- 
plexity is measured by the usage of "resources" required by a speci- 
fic computation. These resources are, broadly speaking, time and 
space. Time is frequently expressed as the number of conventionally 
defined steps required by the computing device in order to com- 
plete the computation. Space may assume different connotations, 
such as. the number of required memory locations in a somewhat 
idealized compujter, the required length of the tape(s) of a Turing 
machine, or, as isfhe case in models of parallel computation, the 

^number^of individual processors simultaneously cooperating for the 
completion of a computing task. 

. The most abstract approach, known indeed as abstract compu- 
tational complexity [2], [3], aims at a theory of complexity which is 

' /nachine-independent. It concerns essentially the complexity of all- 
possible pomputations, that is, the computation of all partial recur- 
sive functions from the integers to the integers. Its results, which 
hold essentially for any measure -bf complexity and for any com- 
puting device, are fhathematically very sophisticated and some- 
times surprising,, but are not particularly, relevant to the applied^ 

^-worker who is seeking a solution of ai specific problenuon_a^con^ 
ventional computer. ■ • * 

All the other approaches constitute what is known as concrete 
computational complexity. Since we shall not deal withkthi^abstract 
approach, we shall hereafter omit for brevity the attribute "con- 
crete."in this brand of complexity, which is by far the one which 
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has attracted the greatest research interest, specific models of com- 
putation are selected for a definition of measures of complexity. 
Without analyzing the various interesting shades which have been 
considered, the two most* prominent models are the Turing ma- 
chine and a suitably idealized version of the" digital* computer 
(Random access machine (RAM), Random access storeci program 
machine (RASP), etc., discussed in [4, Chapter- 1]). ■ „ 

Theu advantages of" the Turing* machine model are the simplicity 
of its instruction set, the well-definedness of its computation "stej5," 
and the reasonable assumption that all steps have identical dura- 
tion; these facts greatly simplify the. assessment; for example, of the 
amount of time taken by a Turing machine to complete a given 
task. On the other Hand, real (or, realistically idealized) computers 
have a much jno re complex instruction set than a Turing machine, 
and, although simulation of any computer by a Turing machine is 
possible, very" complicated Turing machine programs! may be 
needed to simulate single. computer instructions), Thus analyses for 
the Turing machine may be' scarceily signified^ fonthje applied 
analyst, except for problems whose knowFse^tion-fflgeilthms re- 
quire such an inordinate amount of time that replacing a Turing 
machine with an actual computer does not change the order of 
magnitude of the complexity; we shall reconsider this class of pro- 
blems later in this section. ' } v 
' A great part of the currentfresults,;arid particularly those which 
are of interest to the computer practitioner, make reference to 
models whose repertoire of elementary operations resemble those 
of conventional computers; One immediate difficulty is that, hi s 
order to be as realistic as possible, each instruction should be 
assigned a specific execution time. One could, of course, follpw this 
approach and develop on paper, in a very detailed fashion, a 
"pedagogical computer," such as the well-known \MIX used 
throughout Knuth's work [5]; and there are instances, as we shall 
shortly see, in which such detailed analysis i^s the only reasonable 
approach. In most cases, however, one contents onftself with an 
analysis that gives as a measure of complexity the number of times 
some selected key operations are performed when running an algo- 
rithm, the justification being that the time actually required by the 
algorithm is proportional to that number. The choice of the key 
operations greatly varies from application to application. For ex- 
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ample, for sorting, merging, and pattern-matching algorithms, it is 
natural to use the number of two-element comparisons as a meas- 
ure of complexity; in numeric or algebraic applications, one consi- 
ders the numbers of arithmetic operations, such as addition, multi- 
plication, *md division; arid so on. Frequently, however, since the 
execution of a program consists of a concatenation of two types of 
processing — straight-line sequences And loops of instructions— and 
the bulk of the time is customarily attributable to thfHoops, one 
may ignore the time taken by strajght-line sequences and simply 
count the number of times loops are executed. 

A simple example may clarify this point. Suppose we Want to 
design an algorithm to test whether a given positive integer n is 
prime or nqt. For simplicity, assume that the word size of our 
computer is sufficiently large to contain the representation of n. A * 
naive solution involves testing tjje divisibility of n by any integer j 
such that 1 < j < n; n is a prime if and only if n is not divisible by 
any such j. This approach gives the following algorithm: 

1. If n = 1, 2, then n is prime; halt. 

2. Set j «- 2. 

3. Whitej < n do: 

f4. If fa is divisible by j, then n is composite; halt. 

' \5. Let j 4-j + 1. ^ 
6. n is prime ; halt. 

Clearly if n is Rrime, the loop consisting of steps 4 and 5 is executed 
(n - 2) times, vHiereas steps 1, 2, and 6 are executed exactly once. 
Notice that either step 4 or step 5 may involve several computer 
instructions, although a fixed sequence in both cases. Thus the total 
running time is proportional to the number of times the loop is 
executed A less trivial approach to this problem is based on the 
consideration that a composite integer n is the product of two 
integers ji • j 2 , the smaller of which cannot exceed the value jn. 
this leads to a variant of the preceding algorithm, where step 3 is 
replaced, by "while j 2 < n do." Notice that, in this formulation, 
wfyen n is prime the loop is executed approximately vn times, 
although the test "j 2 < n" involves squaring the index j and is 
therefore somewhat more complicated than the test "j <.n." We 
observe that, without referring to a specific computer, we are 
uiiftble to quantify upper-bounds to the times taken by |^ e two 
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algorithms for a specific n; we can say, however, that the variant 
runs faster than the original procedure, and, specifically, that the 
running times "grow like n and >/n," respectively. This^determi-" 
nation of a/'rate-of-growth" or "order of magnitude" of measures 
of complexity of an algorithm, ^bether time or space, as functipns 
of the input, is a central feature^ of computational complexity and 
deserves some additional attention.- 

A complexity measure ^or analgorithm is normally expressed by 
a, function (f>( )-of some significant indicators, which specify the 
"problem size" No ^general definition of pjoblem size is possibly, 
since the latter essentially depends upon specific characteristics of 
the mathematical objects constituting the problem input. For ex- 
ample, we have seen that, in a. test for primality of n, n itself is 
taken as the problem size; in graph problems^one may take the 
number v of vertices or the number e of edges of the graph, or the 
pair (v, e); in matrix problems, one usually takes the order of the- 
matrix, and so on. The only characterization of problem size which 
has a vague appearance of generality is the number of items in the 
input set (called the input size), since this choice is natural for a 
ylarge number of problems. In some cases, the input size is given by 
the number of bits, required to express the input set. With 'these 
precautions, we now assume that a single natural integer n specifies 

*the problem size. Then, -according to a recent proposal by Knuth 
[6], we say that some complexity measure <f>(n) of an algorithm is 
0(f(n)) (read "of order (at most) f(n)," for some function f( )}, if 

^here exist positive constants C and n 0 such that | (f>(n)\ <, C f(n)for\ 
n>n 0 .* The* "O" notation (also called "big-oh") refers to an 
upper-bound to the complexity measure; thus, when we want. to., 
refer to a lower-bound, we shall use the notation fi(f(n)) (read "of 
order at least f(n)") if there exist positive constants C and n 0 such 
that <f>{n) ^ C(f(n)) for n ^ n 0 . Obviously, these definitions apply 
when n-» oo and are therefore called "asymptotic measures." 

The consideration of the order of magnitude not only allows one 
to ignore the particular value of the constant of pro portionalit y 
between the! actual running time and the function f(n), but, more 



* According to current jargon, sometimes the phrase "an 0(f(n)) algorithm** is 
used with the meaning of "an algorithm whose complexity is 0(f(n))" *' 

: a . ■ ■ 



COMPUTATIONAL. COMPLEXITY 



201 



important, allows the comparative evaluation of the asymptotic 
performances of several algorithms devised for a given problem. 
The asymptotic behavior of algorithms is very significant because it 
determines the choice of the algorithm when the input size becomes 
very large. For rather small values of n, however, this approach 
must bp £ken with some discretion, since in such cases- it-is the 
actual running time and ritU its order of magnitude which deter-, 
mines^he choice of the algorithm. • 

' When discussing the algorithms for testing primality, we esti- 
mated the largest value of the junning time, which* occurs when the 
integej; n.is prime; however, shorter times occur when n is com- 
posite. Our simple analysis of those'algorithms obtained a "worst- 
case" measure of complexity. The performance of an algorithm can 
be-evaluated either on a worst-case input or on an average-case 
input, assuming some, distribution over the set of possible inputs. 
Both types of analyses are significant; a large majority of the; 
known results, however, are worst-case analyses both because sudr 
analyses are normally simpler and because, but for a few casei 
there is little agreement on the choice of the probabilistic model. ; V 

The preceding discussion merely sketches thfrmain features of aj/ 
important aspect of concrete computational comp&xity, known as 
analysis of algorithms. The performance of ^ algorithm which, 
solves a given problem implicitly provides anjBxSr bound to the 
performance, or complexity, of the set of algjiHK|S which may be 
devised for that problem. Closely relate^ |HBknalysi§ is the 
research effort known as design of algorithms, ^H^»)^ctive is the 
development of procedures for the solution of a^giyfn problem, so 
that their performance is provably superior to that of previously- 
known algorithms. A complementary aspect of computational com- 
plexity is the determination of lower-bounds to the performance of 
any possible algorithm for a given problem. Clearly, when the two 
bounds come close or, better, coincide, one has discovered an opti- 
mal algorithm, i.e., one has obtained a characterization of the in- 
herent difficulty of a problem. One must readily add, however, that 
this fortunate event occurs only for a small minority of the pro- 
blems considered and that researchers in this area are much more 
skillful in designing and analyzing algorithms for a specific problem 
than in proving their optimality. 

* Finally, to provide an appreciation of the great development of 
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this research area, we simply list the following facts, which hold at 
the time of this writing: four specific textbooks [4], [7], [8], [9] arc 
available and more ^are announced as forthcoming, not to mention! 
Knuth's monumental work [5], which can be' appropriately con- 
sidered as an encyclopedia of computational complexity; journals 
of theoretical computer science devpte increasing attention to com- 
putational complexity, and so do several prestigious symposia. In 
consideration of these extensive developments, even a survey of the 
field would be a project largely exceeding* the scope of this article. 
Therefore, in order to ppncretely illustrate-some of the approaches 
and techniques used in this field, we shall confine ourselves to, th^v 
•case-study format and describe in some detail some significant re- 
sults in several problem areas. 

None of the, case studies planned for discussion concerns the 
sp-called NP-complete problems, mainly because of the very exten^ 
sive and rapidly growing literature on this subject (see, for example, 
[4], [9] and the excellent textbook by M Garey and D. S: Johnson 
[23]). However* due to the central importance of'this topic in the 
theory of computational complexity, we canno.t clo r se this introducr 
tory section without mentioning some 6f its salient aspects. For - 
several years computer scientists and practitioners have been con- 
fronted with^very difficult problems, mainly arising in combi- 
hatoricsrppetations research, and graph theory, such as problems 
of scheduling, assignment, sequencing, etc. All these problems 
appear to require an inordinate amount of t.jme for their solution, 
specifically a number of computational steps exponential in the 
input size; More recently, through the combined efforts of S. A. 
Cook [10] (who ( pioneered the topic), of R. M. Karp [11], [12], 
and of several other workers, it was shown that almost all the 
classical combinatorial problems reputed to be -intractable are 
equivalent in the sense that, if one of them is solvable by a 
pdlyribmial-time-bounded algorithm, all of them are. The equival- 
ence is based on the fact that any problem in that class can be 
reformulated as any other problem in the class by means of a 
transformation which requires at most a polynomial time. Thus, 
since the transformation", albeit complexr can be carried out in 
polynomial time, it is ensured that the distinction between an .. 
exponential-time effort and a polynomial-time effort is preserved 
through the transformation. This justifies our earlier; remark; that 
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the Turing machine is an appropriate computation model for this 
class of problems. The term u NP-complete" is an abbreviation of 
"nondeterministic polynonjial-time complete " where, nondetermi- 
nistic polynomial-time means that there exists a nondeterministic 
Turing machine fdr'solving the problem (or, equivalently, a back- 
track search algorithm of polynomial-bounded depth for that 
problem), and complete, refers to the mentioned problem transfor- 
mability. The interested reader should consult the cited references 
to familiarize himself with this fascinating topic. X 

2. A COLLECTION OF CASE STUDIES J;\- \ * 

In this section we shall examine in some detail some $0$ 
feseritative problems from different areas. Although the comrfcron 
objective is the design of algorithms which are either optimal or 
whose complexity measures improve over previous results/ none- 
theless in each particular problem area conventions, measures of 
complexity/ and techniques have- a distinguishing flavor. Hopefully, 
the common features as well as the dissimilarities will emerge from 
the following presentation. 

2.1. Multiplication of Integers ' 1 

2.1.1. Generalities. The problem of multiplying two integers falls 
in the area of arithmetic or numeric computation. The two oper- 
ands are assumed to be given as two sequenced of n digits each, 
and, without loss of generality, we shall assume the digits are 
binary. Since n may be arbitrarily large, the operands cannot be 
/ stored in the memory of the computing device. Therefore the natu- 
ral model for this type of problem is a tape machine with a bound- 
" ed memory, which, for that matter/could be a Turing machine. The 
computational steps* to bV counted are conveniently chosen as opr 
erations with operands consisting of one bit each. This choice may 
' appear a little, artificial at first sight, since the arithmetic instruc- 
tions of any practical computer act an strings of several bits; how- 
ever, when the operand size is much larger than the computer word 
size, our simplification will only change thV computation time by a 
multiplicative constant. These elementary operations are usually 
' referred to as "bit operations," to contrast them against "arithmetic 
operations" which act on word size operands. 
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The simplest and best-known procedure for integer multipli- 
cation is the so-called "schoolboy method," which consists of the 
addition of appropriately shifted partial products; The number of 
bit operations is clearly 0(n 2 ) for operands of length n, since there 
are n partial products, each of which consists of n bits. This method 
can be speeded up by the following modification. Let 

. Asa nM a n 4„. a 0 and B = b n _! b n _ 2 ... b 0 

be the two operands and assume, for simplicity, that n is even. We 
split the sequence a n _i ... a 0 into two halves a n -i ... a n/2 s A t 
and a (n/ 2)-i ... a 0 = A 0 so that A = A^"' 2 + A 0 and do likewise 
for the operand B. Then we hayp 

AB-A 1 B 1 2 n +(A 1 B 0 + AqB^ 2 + A 0 B 0 . (1) 

This expression indicate^ that the multiplication of two n-bit oper- 
ands is reduced to four "multiplications of two (n/2)-bit operands, 
with no apparent computational advantage. Suppose now we com- 
pute the functions 

Ci = AiB i9 G 0 = A 0 B 0 , and C 2 = (A 0 + AiKB 0 ■+ Bi); 

next we note that A^Bq + A 0 B! = C 2 - Q - C 0 , i.e., the three 
terms of the expression (1) can be computed with three multipli- 
cations and four additions of (n/2)-bit operands. It is easily realized 
that addition can be done in a number of bit operations which is 
proportional to the operand length. It follows that, denoting by 
M(n) the number of operations or, briefly, the time to multiply two 
n-bit operands, we obtain a recurrence equation p 

M(n) - 3M(n/2) + 0(ri), 7 

which definesoM(n). The solution of this equation is obtained by 
standard methods and is found to be M(n) = Q(n ,OB 2 3 ), which 
shows a substantial improvement over /the naive schoolboy 
method. Incidentally, we have just seen an instance of the "divide- 
and-conquer" technique. This technique, which is widely used in 
the design of algorithms, consists in reducing the original problem - 
to a .collection of simpler problems. -\ 
^Before trying to push further the previous approach, let us take a 
critical look at what we did. If we replace the number 2 n/2 by an 
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indeterminate x in the expressions (A 1 2 n/2 ■+ A 0 ) and (B x 2 n/2 + B 0 ), 
we map the 'integers A and B to two polynomials (A x \ + A 0 ) and 
(BjX 4- B 0 ), respectively. Thus the terms AjBj, AjB 0 + A 0 B t , and 
A 0 B 0 are the coefficients of the polynomial C(x) = A(x)B(x). Each 
of these coefficients is an n-bit binary number, and, when 2 n/2 is 
substituted ba§]^ fori x, the relative alignment of their represen- 
tations Is as shown in Figure 1. Thus, the product AB is obtained 
by adding these three numbers as shown in Figure 1, and this 
operation is normally referred to as "releasing the carries " 

We have therefore transformed an integer multiplication oper- 
ation into the follow ing^eqyjence of operations: 

1. Multiplication of two polynomials. 

2. Release of the carries. / 

... . ;■■ ■. . ' _ ? 

As we shall later see, vthe carry release is a simple operation 
which can be done in tiirie proportional to the nunjber of bits of 
the result. Therefore, we shall -concentrate on polynomial multipli- 
cation./ ■ V... •■ ■ ' > ■ 

2:1.2. Multiplication of two polynomials (evaluation and inter- 
polation): A straightforward itiethod for polynomial multiplication 
involves distributing one polynomial into the other and collecting 
terms with identical powers of k. But there is a more subtle method. 
Suppose A(x) =X^=o A } x j and B(x) = BJx j ; then their prod- 
uct C(x) has degree 2p - 2, i.e^ it has (2p -l) coefficients. Select 
now (2p - 1) distinct real values x t , ...■ ,x 2p -u called "points," and 
evaluate A(x) and B(x) at each bf these points, thereby obtaining 
two sets of values {Afo)} arid {B(Xj)}. Then clearly, for any i in the 
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■ *-{.\ ■ n bits 

0 + 1 bits • . 
AiBx | \ 



■ n bits . ■ : " . v •'. . - 

Fig. L Relative alignment of A 0 B 0 : ^ A 0 Bi + A^o, and A^i, 
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range [1, 2p- 1], C(X|) m A(x,) • B(x,). Frqm the set of values 
{C(Xi)}\ we can now interpolate the polynomial C(x) of degree 
(2p — 2). Thus we have the following algorithm: 

Polynomial multiplication 

Input : t A(x) and B(x), both of degree (p - 1), and distinct real 

values x i9 ... , x 2p - v 
Output : C(x) -A(x) • B(x). 

Step If Evaluate {A(x,)} and {B(X!)}. ' . ""': >' . 
Step % For i ~ 1, . . . , 2p - 1, compute C(x,) = A(x,)B(xi). 
Step 3. Interpolate C(x) of degree (2p - 1) from {C(X|)}. 

Evaluation and interpolation are mutually inverse and can be given 
a very compact description. Notice in fact that . 



[Cq, .'..) C 2 2 ] 



1 ;■>, 1 ... 1 

Xj x 2 ... X Jp _.| 

Y 2p-2 T 2p-2 v 2 P~ 2 

X J X 2 ... X 2p -j 



[Qx,),...,^^,)]. 



Letting Ca[C 0 ,.,,C V2 ] and C(x)a[C( Xi U;, Qx^-J] 
and denoting by V the above matrix, we. have • r r\ 

CV = C(x) (evaluation), 

and, since V, a Vandermonde matrix, is nonsingular, ' ■ 

C = C(x)V" 1 r (interpolation). ■• 

We see that both evaluation and interpolation are equivalent to 
multiplying a vector by a matrix. It is now possible to develop an 
integer multiplication algorithm base^|^he /outlined polynomial 
multiplication method. We choose the integer p sufficiently large, 
select (2p - 1) integer values x x , ... , x 2p -i and construct two fixed 
(2p - 1) x (2p - 1) matrices V* and V" 1 . Next, we split the n-bit 
sequence representing the operand A into p segments : of n/p bits 
each; each of these segments can be viewed as the binary rep- 
resentation of an integer. Next we regard these integers as coef- 
ficients of a polynomial A(x) of degree (2p> 2) (notice that the 
(p — 1) higher degree coefficients of A(x) are 0). We do likewise for 
the other operand R. We now evaluate {A(X|)} and (B(x,)}. Notice 
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that multiplication of s-bit integers by a constant with a fixed - 
number of bits requires time proportional to s; it follows that the 
p(2p— 1) multiplications of n/p-bit integers by constants, as speci- 
fied by the evaluation step, collectively require time O(n), and the 1 
same can be said for interpolation. We also observe that the num- 
bers A(X|) and B(xj) are (n/p + k)-bit integers, where k is a constant 
depending upon the values x lf . , . , x 2p .i. It follows that the compu- 
tation of each C(Xi) is a multiplication of two (n/p 4- k)-bit integer^. 
With the usual meaning of the function M( ), we have 
M(n/p + k) = M(n/p) + 0(n/p), and we can establish the following ^ 
recurrence equation which defines M(n): 

M(n) = (2p ^l)M^ + p(n); , (2) . 

the solution of this equation is ^ »V— 

M(n) = O(n {0 ^ 2p " l) ) = .O(n 1 + 1 > l0g > p ). 

Thus by choosing p sufficiently large the integer multiplication time 
can be bounded from above by 0(n 1+e ) for any e> 0. Recurrence 
equation (2) has a somewhat undesirable feature: its right member 
consists of two terms, the first of which dominates , the other and 
determines the form of the solution. In similar cases one .seeks a 
modification bf the algorithm which tends to equalize the two 
terms; we recognize that the functional form, of M(n) is due to the 
fact that p is fixed, which, on the other hand, is exactly why we 
were able to estimate as O(n) the evaluation and interpolation 
times. Thus possible improvements can arise by making p a func- 
tion of n; this is one of the key ideas of the remarkable integer 
multiplication algorithm due . to "Schonhage and Strassen [13], 
based on the Discrete Fourier Transform. 

2.1.3; The Discrete Fourier .Transform and the FTT. Of course, if 
p is no longer fixed, the Vandermonde matrix V and its inverse 
Y" x cannot be precomputed as auxiliary devices, but are deter- 
mined by the operand sizes. Therefore, one must look for a pair 
(V, V 1 ) which can be -easily computed in each case.yjnder rather ~ 
general hypotheses (i.e., that we deal with a commutative ring S), a 
surprising solution is obtained by choosing x l = a) l "" 1 l for 
i = 1, q, where ^cb is a primitive root of unity of order • 
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q A 2p — 1 in S. With this choice V becomes 



V = 



1 
1 

I 



1 

CO 
Oi' 



1 

co 2 



art 



fl-l 



CO 



,2(q-D 



CO 



,(q-l)(q-l) 



and* is known as a q x q Fourier matrix. Notice that (V),j = 
co (l ~ 1)(J-1) . ^remarkable property, of this matrix is that 



& jo oi 



S"mod q 
otherwise. 



0, 



It follows that if we take the square of this matrix, the entry (V 2 ),j is 
given by 



(V 2 ),j =I(V) 1 ,(V), J =Ico<'- 1 >"- 1 > + «J- 1 >"- 1 > 



= ~V co«-"«i + '- 2 > = i q if0 +j_2)m 
,_,L 0 (0 otherwise. 



mod q = 0, 



We conclude that V 2 has the form 



V 2 = 

r 



10 
0 0 
0 0 



0 



0 
0 
0 

i 
o 



oo 

0 1 

1 0 

0 0 
0 0 



i.e., V is its own inverse except for a row and column permutation. 
But a most attractive /property of the Fourier matrix is the ease 
with which it can be /multiplied by a (2p - 1 ^dimensional vector, 
by resorting to a technique known as the Fast-Fourier-Transform 
(FFT), due to CooleyUnd Tukey [14], which We shall now outline. 

SupposerfoiLsimplicity, that q = 2r and a = [a 0 , a 2r j,]. We 
must compute siW\Jhe~Discrete Fourier Transform (D|T) oi "a. Let 
us compare the components C aV )j + i and (aV) J + 1+r . The former is 



given by the expression : 

' / 2r-l 



r-1 



+ 1 



£a^= Ia 2i co 2 "+ £a 2i+1 co 2,J+J 



1 = 0 



1 = 0 



1 = 0 



r-1 



(3) 



1 = 0 



1 = 0 
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Similarly, the component (aV) J + 1 + r is given by the expression: 



(aV) J + 1+r - £ a,a>« + '» - £ a 21 (a> 2 )» • (to 2 ') 1 

1-0 l»0 



r-1 



+ co J co r Xa 2l + 1 (co 2 ) l V r y 

1*0 , 



r-1 



l«0 



(4) 



since w 2t = + 1 arid a/ « — 1 (w is the primitive root of unity of 
order 2r). Notice now that La2i(co 2 ) IJ is the (j + l)-st component of, 
the discrete Fourier transform of the r-eomponcnt vector formed 
by the even-indexed coefficients of a (this Fourier transform obvi- 
ously uses the root /of unity co 2 , which is of order r). Similarly, 
£a 2 i + i(co 2 ) ,J is the (j + lj-st component of the discrete Fourier 
transform of the odd-indexed coefficient sequence. Thus, we find 
that (aV) J + l and (aV) J + 1+r are computable by the arrangement 
shown in Figure 2. We see that the DFT of a 2r-component vector 
is obtained with r multiplications, r additions, and r subtractions 
from the DFTs of two r-component vectors. "Assuming that q is a 
power of 2, the same analysis can be carried out for each of the two 
FFT-computers for r inputs, and we reach the conclusion that the 
FFT calculation requires 0(q log q) arithmetic operations. A simi- 
lar result holds when q is a highly composite number, although' we 
are explicitly interested in the power of 2 case. 

The use of the Fourier matrix certainly simplifies the operations 
of interpolation and evaluation; however, there is still the difficulty 
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Fig. 2. A scheme for computing (a V)j +1 and (»Vk+i±r . 
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that it requires 0(q log q) complex multiplications if co is the qth 
primitive root of unity in the complex field. Schonhage and Stras- 
sen avoided this difficulty by the following extremely brilliant so- 
lution. Supppsc that the two operands A arid B are integers rep- 
resentable with at most n = 2 m ~ l bits. Then, the product AB < 
2 r < 2 2 " + 1. The integer 2 2 " + 1, denoted by F m , is called the 
Fermat number of order m and the set Z m of integers modulo F m is 
known as the Fermat ring of order m. Notice that as long as 
A • B e Z m , we can assume that arithmetic be done in Z m . A cru- 
cial property of Z m is that 2 is a primitive root of unity of order 
2 m+l ; in fact 2 2 * =* -1 in Z m . It follows that multiplications of an 
integer by powers of the root reduce to shifts of the representation 
of this integer, provided} we use 2 m+ l bits for representing the 
operand's. We now have developed all the preliminaries to the de- 
scription of the integer multiplication algorithm. 

2:1A The Schdnhage-Strassen integer multiplication algo- 
rithm: In Figure 3 we sketch the relative length of the nonzero 
portions of the representations of the operands and of the product. 
Letting m = 2s — 1 (the case of even m is analogously handled), we 
chop the bit sequence of each operand into 2 8+1 segments of 2"~ l 



r bits each and let A|_i(B| 



segment in A (in B), for i = 1, . 

2» M - 

1 A= X A ^ 2 
Next, we compute , 

AB = C= Z 



t ) be the integer represented by the ith 
2 s+l . Thus we have 



B = 



1 = 0 



l* * 1 — I 
j«0 



k = 0 



( Z Mj) 

mod2** J 



E 

k=0 



C k 2 



k2 f ' 



(5) 



00... 
00,.. 

00.;. 



n+1 — 

...00 
...00 








1 





Fig. 3. Alignment of operands and result. 
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It is convenient to split the lust summation into two parts, i.e., 
C= E C k 2""' + lTc k 2«-' 

= ZC k 2"- + ZC k + 2 .2 k2 -' -2*\ 

4b' ko ° k "° 
Recalling that 2 1 ' = — 1 we have 

••'.'V " / - 2* — I 

C= Z(C k -C k + J .)2 k2 ' M . (6) 

Notice now that (C k — C k+r ) could be a negative integer. How- 
ever, C i <2 ,+1 • 2 V for j =0, 1, ..., 2 ,+1 - 1 (since, by (5), it is the 
sum of 2 s+l products whose moduli are bounded by 2 2 '), whence 
(C k — C k + 2 ')> — 2 ,+ l+2 '; since it wilt be convenient to deal with 
positive coelTicicnts, if we add and subtract the same quantity from 
the right member of (6), we obtain 

■ k»0 . ■ . , ' 

+ X 2' + 1 + J, 2 k r "' a Z z k 2"-' 

where each z k satisfies the inequalities ' \ 

0 < z k < 2 , + 2 + 2 ' <S 2 S+2 (2 V + 1) - 2* +2 F s . 

(Notice that 2 S+2 and F 9 are relatively prifne.) At this point, the 
original multiplication is reduced to the problem of releasing the 
carries when adding the appropriately shifted numbers z 0 , z x , 
which are shown pictorially in Figure 4 We must still find an 





f -25 "1 

s-t-2 




^" 2S "1 























z 0 



Fig. 4. Alignment of z„, z,, z 2 , ... . 
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cfTicicnt way for computing tho coefficients z k . A most unexpected 
method for effecting this computation is provided by the following 
algorithm: 

Computation of z k 

1. Computez k Az k 'mod2 ,+ 2 ; 

2. Compute z k A z k mod F, ; 

3. Reconstruct z k from z k and zj[ . 

The correctness of this procedure is supplied by the Chinese Re- 
mainder theorem, since z k and z k are the. remainders, modulo two 
relatively prime integers. We shall now discuss how steps, 1, 2, and 
3 can be implemented and analyze their complexities. 

Letting a, = A,mod 2 , + 2 and ft « Bjinod 2 ,+2 (for ij » 0, 
2 , + 1 - 1), we obtain 

z' k - [ I {«ift) + £ («ift)l 

-P £ >(«ift) + I («,ft)l(mod2' +2 ); 

■ Li+J-k + 2 4 i+J-k+3-*. J 

The terms of the form I| +j „ k (o£jft) mod 2 I+2 are easily computed 
by forming two integers A"* and B' whose representation are shown 
below. . 

A's0...0a r M,j0... ... 0... Oct! o/.,0a 0 

B's0...0ft 1+ u 1 Q... ... 0...0/?! 0...0j» 0 

where the number of the zeros separating «| and + 1 is chosen as 
the minimum required for avoiding any propagation between two 
consecutive columns when performing the multiplication A' • B'. 
Since each column is the sum of at most 2 ,+ 1 terms upper-bounded 
by the values (2 ,+2 ) 2 we see that (3s + 5) bits suffice to hold ftich 
sum. It follows that A' and B' can be formed as integers with 
2 , + 1 (3s + 5) bits each, which can be multiplied in time at most 
0([2 ,+1 (3s + 5)] ,08l3 )<0(2 2 '), i.e., in time less than linear in the 
length of the original operands. 

Next, we shall consider step 3, i.e., the reconstruction of z k from 
z' k and zj. Notice at first that z' k = z k -jF,, for some unknown 
integer j < 2 s + 2 . If we can determine j from z k and z k , we obtain^ 
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z k v Now wc have: 

z2«z k -jF t »z k -j2 r -j, 
whence z k mod 2 ,+2 « z k mod 2 ,+2 - j, that is, 

z£mod 2 ,+a »X-j- 

We conclude that j = z k - zj[ mod 2 ,+2 . Thus the computation of j 
requires time OJs+~2) and the computation of z k from z k and j 
requires time 0(2 ,+ 1 ). We conclude that the reconstruction of all 
z k 's from the corresponding pairs (z' k , z k ) requires time which is 
linear in the length of the original operands. 

' Finally, we consider the computation qf z' k , which turjns°out to 
dominate with its complexity the operations leading to the compu- 
tation of z k . Since z k e Z t , it is natural to carry out the operations 
in 2 |f i.e., we must compute 

Zk [(C k ~C K + 2 .)mod F, + 2 ,+ l t 2, ]mod F f . 

The teriii (C k - C k + 2 .)m°d F, is expediently computed by using the 
FFT algorithm. Specifically we have ' 

1. Compute the forward FFT in 2, of the sequences (A 0 , A lf 
A y »i - 1 \ (B 0 » Bi, By.»„ j ); 

2. Multiply corresponding terms of the transforms (this is done 
by recursive calls of the multiplication algorithm in Z s ). 

3. Compute the inverse FFT. 

To analyze the complexity of the overall multiplication algo- 
rithm, let N(m) be the number of operations required to multiply 
two integers of lerffcth 2 m « 2 2, ~ ! . Steps 1 and 3 have global com- 
plexity 3 x (arithmetic pperiations for FFT of 2 s terms) x (length of 
the operands) = y 0 [2 s y(s + 1)]2' for some Constant y 0 \ Step 2 
involves 2 >+1 ipultiplications in Z,, i.e., it has complexity 2 9+1 N(s). 
Thus we obtain 

N(m) <> 7o 2 2i+ Hs + 1) + 2 ,+ r N(s) + 0(2 2 '). 

If we solved this equation, we would obtain the result N(m) =?= 
0(2 m m 2 ); however, there is a further simplification that, almost 
magically, allows another reduction in complexity. In fact, we need 
not compute the 2 ,+1 tenhs C k mod F, , but only 2* differences of 
the form (C k — C k + 2 0niod F,; It is easily shown from the proper- 
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tics of the FFT that these differences arc completely, determined 
from the odd-indexed terms of the transform product; hence step 2, 
in this revised form, requires only 2* multiplications in Z,. It fol- 
lows that 

N(m) <; y 0 2 2, + i (s + 1) + 2'N(s) +.0(2 2> ). 

If we assume now-for simplicity that m « 2s — 1 ==» 2(2si - 1) 
— 1 a,..,wc obtain after one step 

N(m)^7o2 2> + l (s+ 'l) + 2 l [y 0 2 a, '* l (»i + ») 

+ 2 l 'N(s,) + 0(2*')] + 0(2 2 ') + o(2 2 ') 

v «yo2 2 ' +1 (s+l) + y 0 2 2 ^^ 

+ 2 , + >1 N(s l ) + 0(2 2l ) + o(2 2i ). 

Thus, each time that we approximately halve the number of bits of 
the operands, wc add' a term of the forrti 0(2 m m); this process will 
be repeated 0(log m) » 0(log log n) times, whence we obtain the 
upper-bound expressed by the following theorem: 

Theorem. Two n digit operands can be multiplied with at most 
0(n log n log log n) bit operations. 

2.2. Sorting by Merging. The problem of sorting n elements of a 
totally ordered set A (typically, n numbers) is one of the most 
celebrated and thoroughly studied examples in the area, commonly 
called combinatorial or nonnumerical computation. The fact that the 
elements dealt with are numbers is basically accidental,, since the 
key operation used is "comparison," i.e., a test by which we can 
detect the ordering relationship between two elements of the set A, 
and the algorithms specify the strategy of execution of the compari- 
sons to obtain the desired sorting. 

There is a voluminous literature on sorting (see, e.g., [4], [9], 
[15]), concerning both practical algorithms and some deep theo- 
retical questions, some of which are only partially answered; the 
interested reader is encouraged to refer to it. Our objective in this 
section is to discuss a specific and very important sorting tech- 
nique, .called sorting by merging (or, briefly, merge-sort), in different 
processing environments: the single conventional processor, the 
network of comparators, and the parallel processing system. 

. 23i .' ; 
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Let us first consider the tasic idea of mcrgc-sort Assume that 
the elements to be sorted are placed in a unidimcnsional array A 
and let A[i] denote the 4th clement of this array; also, let A[i: j] 
denote the segment A[i], A[l + 1], A[j]. Suppose now wc Have 
an algorithm called MERGE (A It A 2 ), which accepts two ordered 
sequences A t and A 2 and combines them into a single ordered 
sequence. Wc then have the following algorithm: 

Algorithm SORT (A[l:n]) 

Step 1. Set A[l:rn/2l] *- SORT(A[l:fn/21]) and 
A|Jn/21 + 1 : n] - SORT(A[fn/21 + 1 : n]J. 
- Step 2. Set A[l:n] <- MERGE(A[1 :[n/2']l AtfrU2\ + l:n]) 
and h^lt, Jf/f 

Notice, incidentally, that this is a recursive algorithm whfcti con* 
■ tains among its steps "calls'* to itself for operating on inputs of, 
increasingly smaller size (in this case, geometrically decreasing). In 
words, wc split the original array into twoYegments off approxi- 
mately the same size, sort them separately, and finallysyse ;thc 
MERGE operation to combine them. Thus, we can have as)ntany 
distinct algorithms which comply with the description givcnl above 
as we have ways of specifying the MERGE operation. 

A simple merge algorithm constructs the combined sequence 
term by term. The first term is the smaller of the respective smallest 
terms of the two sequences to bp merged; this term is riemoved 
from the input sequence to which it belongs and transmitted to the 
output sequence, and exactly this process is repeated until the 
input sequences are exhausted. Less descriptively, let A[l :n] and 
B[l:m] be two input sequences with A[l]^;- • A[n] and B[l]^; 
• • £ B[m], and C[l : n + m] » MERGE(A[1 : n], B[l : m]). The se- 
quence C[l : n + m] is constructed as follows: 

Algorithm MERGE (A[l:n], B[l:m]) 

Step 1. Set i <- X j «- 1, k«- L . . 1 
Step 2. If i > n, set C[k : m + n] <- B[j : m] and halt. 
Step 3. If j > m, set C[k:m + n] A[i:n] and halt. 
Step 4. If A[i] £ B[j], set C[k] «- A[i] and i «- i + 1 ; 
• else set C[k] 4- B[j], and j «-j + 1. 

Step 5. Set k k + 1 and go to step 2. / 
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We now analyze the time complexity of the giyeA algorithms. 
Since, in our scheme, sorting depends critically on mergiqg, we 
begin from the latter. Notice that each comparison between A[i] 
and B[j] is accompanied, by a fixed amount of work in the loop 
consisting of steps 2, 3, 4, and 5. Since this loop is executed at mo^t 
(n +.m - 1) timesT we concRide that the running time ; M(n, m) 
required to merge two sequences of respective |^gths n and m is at 
most 0(n + m)/W/th this result, denoting by S(n) the time required 
to sort n numbers, a straightforward analysis of our sorting algo-: 
rithm shows that step 1 runs in time 2S(n/2) and'istep 2 runs in time 
M(n/2, n/2).-Thus we have the recurrence equation 

^ . S(n)'= 2S(n/2) + M(n/2, n/2) = 2S(n/2) + O(n) 

which is solved by standard methods as S(n) = 0(rf log n) It can be 
shown that this order of complexity is optimal (see Section 2.3); 
moreover^ if one exclusively counts comparisons, merge-sort re- 
quires a number of comparisons which diflfers from the optimal . 
only by additive terms which are asymptotically negligible with 
respect to n log n. ^ \ , V 

The merge algorithm we have just described is of the sequential 
type, l;e., the operations are executed in sequence. Such is the pro- 
cessing environment of a conventional computer, also referred to as 
the sequential processor. We qpw waht„tQ investigate whether sort^ 
ing T^y merging lends itself to implementation, by a computing 
system in which several operations can be performed, concurrently, 
or, equivalently, in which several sequential processors can simul- 
taneously operate on the same data set. Any such system is nor- 
mally referred to as a parallel system. 

The advantage that one expects in going from a sequential algo- 
rithm to a parallel algorithm is a speed-up of the computation; in 
other words, one trades time with equipment. More specifically, 
assuming that all processors considered are. constructed with the 
same technology (i.e., liave the same operational speed) and as- 
Wming that Tj is the running time. of the best-known algorithm to 
sdlve a given problem on a single processor, the most that one can 
I hd$e for in employing k processors for the same problem is to 
achieVe-a-rohning time T k = Tj/k; normally, however, except for 
some very particular problems, T k > Tj/k, that is, there is some 
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loss with respect to the optimum speed-up. It is also appropriate to 
mention that the case in which k is fixed is referred to a bounded 
parallelism, whereas unbounded parallelism denotes the case in 
which as many processors are available as one sees fit. p 
In connection with the sorting by merging , problem, we shall 
consider two instances of parallel computing: thfe network of com- 
parators and the fully parallel system (shared-memory-machine). 

The network of comparators is an interconnection of modules 
called comparator^ A comparator is a .two-input, two-output 
device which receives two numbers on its input lines an$njlaces the 
larger number on a specified output line and^the smaller on the 

/other. A network of comparators has an arbitrary interconnection, 
except for the constraints that there is no feedback and that each 
comparator output line is either connected to exactly one compara- 
tor input line or is a network output line; moreover, each network 
input line is connected to exactly one comparator input line (fan- 
out restriction). It is easy to realize that a network of comparators 
is a parallel system. In fact it may be convenient to think of it as a 
cascade of stages, which form a partiti<Mi of the network modules, 
with the property that the modules of a stagecan operate in paral- 
lel. A partition of a network into stages can be obtained very 
simply ; think of the network as a directed graph, whose vertices are 
the comparators and the network input lines and whose arcs are 
the connections, directed toward comparator input lines. With each 
comparator -We associate ail integer i, called the level, which is the 
length of the longest path from the input lines to that comparator. 
Defining as a time unit the time required for a -comparator to 
operate, it is clear that a comparator at level i will not have its 

^operands available before the (i — 1)-St time unit; therefore, if all 
the comparatofs at level hare placed in the ith stage, it is clear that 
they qan operate in parallel. Notice also that, since at any given 
level each operand can be at most on one comparator input line, 
each stag^cuitains at most [n/2J comparators. This indicates that 

. a comparator network with n input lines and with the stated fan- 
out restriction is equivalent to a system of [n/2J processors oper- 
ating in parallel and accessing an n-^cell memory. At each time unit, 
or step, each processor reads two operands from memory, com- 
pares them, and stores into memory the two results. The read-store 
scheme aj the ith time unit is completely specified by the wiring of 
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the ith stage of fhe equivalent network; it follows that this scheme 
is fixed and is-not influenced by the, outcomes of previous 
comparisons— that is, the algorithm embodied by the* network is 

nonadaptive. ° \ ■y'.*'*: 

We shall now analyze the/ number of stages, i.e., the time re^ 
Quired by a network that sorts n numbers by repeated merging. 
For simplicity we shall assume that n is a power of 2, that is, 
n = 2 k . A ^number sorting network can be constructed as shown 
in Figure 5, i.e., it consists, of two 2 k ~ ^number sorting networks 
operating tp parallel followed by a.2 k -number merging network. 
Clearly- the structure of this network entirely reflects the or- 
ganization of the algorithm SORT previous^ illustrated, and our 
problem is reduced to the analysis of the merging network. The 
latter can be constructed as follows. (This very interesting construc- 
tion is due to K. E. Batcher [16].) Let A = (a 0 , ) and 
B ==(b 0 , ...vb 2 ^»-V) be the two sorted sequences to be merged 
with a 0 <> a t <> • <> a*-..,," and £> 0 £ b x <;•• • ^ b 2 *-«-i. The 
merging scheme is also of a recursive type (Figure 6); that is, we 
separately merge the even-indexed terms and the odd-indexed 
terms by means of two 2 k " 1 -number merging networks operating 
in parallel, and( combine the outputs of the latter. This combination 
is very simple to implement. In fact, referring to Figure 6, assume 
iftductively that the elements on the output lines of the merging 
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Fig. 5. A merge-sort sorting network. 
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Fie. 6. Batcher's merging network. 



networks appear in increasing order front top to bottom. Consider 
the element c { which appears on the ith output line of the EVEN 
merging network and suppose, without loss of generality, that it 
coincides with b 2s (for some 0 <; s <> 2 k ~ 2 — 1 ). Clearly on the first 
(i -^1) lines of this EVEN network there are s elements of sequence 
B and (i — 1 — s) elements of sequence A; this implies that the first 
(i — s ~ 2) odd-indexed terms of sequence A are no larger than 
c, = b 2s .It follows that the elements appearing on the (i — s — 2) + 
s = (i — 2) top lines of the ODD merging network are known to be 
^no larger thai) q. By a similar argument one can show that the 
elements appearing on the bottom (2 k ~ l -> i + 1) lines of the ODD 
merging network are known to be not smaller than q, so that we 
conclude that C| must b^compared only with the element on the 
(i — l)-st line of the ODD merging network. Thus the merging 
process is completed by a single stage consisting of(2 k ~ l — 1) com- 
parators following the EVEN and ODD merging networks; hence ' 
a, 2 k -number merging network can be constructed which consists of 
Iog 2 2 k = Ibg 2 n stages of comparators. This result can be used to 
evaluate the number of stages of the sorting network, which is 



log 2 2 k + log 2 2*~ l + 



+ Iog 2 2 = i log 2 2 k (log 2 2 k + 1) 
= i log 2 n(log 2 n 5 + 1). 
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Thus we-see that this parallel sorting algorithm does not achieve an 
optimal speed-up with respect to the best-known sequential sorting 
algorithm, the loss being a factor approximately equal to.£ log 2 n. 

A most surprising fact, however, is that the order of , time re- 
quired by merging with a network of comparators cannot ]be im- 
proved upon, as the following argument, due to Floyd (see [15, vol. 
3, p. 230]) shows. Let C(2t, 2t) be the minimum number of com- 
parators of a network that sorts an input sequence ax, a 2 , 4 4t , 
consisting of two interleaved sorted sequences a x ^ a 3 ^ ••• ^ 
84,-! and a 2 ^ a 4 < a 4t . JEach comparator is characterized 
by a pair of indices (i, jj/if it compares a, and a^. Divide the 
comparators into three classes: Class £ 2t, j £ 2t;. Class 2: 
i £ 2t + 1, j £ 2t -r 1; Class 3: i ^2t, j ^ 2t + 1, Clearly Class 1 
-must form a merging network for 2t inputs, since a 2t+1 , ...» a 4t 
may already be in their fin^ arrangement; so does Class 2. Finally, 
the input sequence for which a 2s > a 2r _ 1 for s, r = 1, 2t re- 
quires at least t exchanges between the first half and the second half 
of the input sequence, so that Class 3 must contain at least t 
comparators. We conclude that 

C(2t, 2t) ^ 2C(t, t) + t 

i.e., C(2t, 2t)^t log 2 t. Since each stage can contain at most 2t 
comparators, the number of stages is at least i log 2 % thus proving 
the original claim. 

It is. now natural tb ask the question whether additional speed- 
ups can be obtained by employing O(n) processors, without the 
restrictions embodied by the ; network constraints. In other words, 
we assume that a given element can be simultaneously compared 
with more than another element, and we let the algorithm be 
adaptive. No better answer was available until very recently, when 
Valiant [16] proposed the following interesting parallel merging 
algorithm. ; 

Let A[l :n] and B[l :m] be t>vo sequences sorted in ascending 
order, with m>n^4, and assume that L^/njmJ processors are 
available. We partition sequence A into segments, the first elements 
of which are A(i[y/a\ + t] for i = 0, 1, L>/nJ - 1; similarly, B 
is partitioned into segments whose first elements are B[i[ > /mT+l]. 
Let Apfv^l + 1] a A'[i] and B[j[7mi + 1] a B;[j] for 
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■it j = 0, 1, ... . We now perform the following operations: 

Algorithm PARALLEL^MERGE 

Step 1. Compare in parallel each element of A' with each ele- 
ment of B'. ? 

Comment: The total number of processors required is ly/n] • 
L«s/inJ ^ L>/nmJ» i-e , there are enough processors to carry out this 
step in one unit of time. 

Step 2. For each A'[i] (i = 0, 1, .... U/nJ decide the 
segment of B into which it must be inserted. 

Comment: Let P H be the_processor assigned to compare ATfl and 
B'[j] for i, j « 0, 1, ... , Since %'[0] • ^ B'[Ls/mJ - 1] the se- 
quence consists of two not simultaneously void segments P i0 , 
P ik and P| -k +i. p i.ufin]-i> such that for j^k we have 
A'[i] £ B'[j] andJbr j > k we have A'[i] < B'[j]. Thus k is deter- 
mined in fixed time by letting each Pjj compare the outcome of its 
comparison with those of Ps.j-! and Pij + i (ignoring for simplicity 
the end-effect). * 

Step 3. Insert each A'fi] into the segment of B determined in 

Step l[ by comparing it simultaneously with each ele- 

V- ment of the latter except the first. 
- ■ / ■■ <■ , 

Comment :* There are (r>/ni1 — 1) comparisons for each A'[i]; thus 

- 1) < U/nniJ processors are sufficient. The insertion 
can be done ir} unit time, by an argument analogous to the one in 
the Comment to Step 2. This insertion induces a new segmentation 
of B into B[kj + 1 :k i+1 ], where B[k,] < A'[i] < B[k, + 1] for 

.i = i,2,..., Lv^J. , • ,.. ■ } ' 

Step 4. For i == 0, L>/nJ — l simultaneously merge in 
: parallel Apr>/n1 + 1 :(i + 1)1^1] and ^[k, + 1: 
■/ ' ki+i]. " , ■ ■ . ' ' ■ ■ 

Comment-ir-This step specifies the simultaneous execution of ly/n] 
parallel merges of -sequence pairs. The ith merge operation in- 
volves two sequences of respective lengths X| and yi , where Xj ^ 
f y/n~] — 1 and yj is the number of elements in B[k| + i:k i+1 ]. We 
now assume inductively that this merge can be done in parallel 
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with L\Ai yd p roces sors. The total number of required processors 
is given by SLv/xiyiJ- By Cauchy inequality we have 

;J *: - : : : : 

recalling that Zxi = n — L\/nj an d £ yi = m ^ e obtain '• 

SLv^iyd ^ 2\/xiy! ^ V(2x,X2y,) = V(n - Lx/n])m 
1 = Vn^ • ^1-^ <; Lv/n^J,' • 

where the latter inequality holds for m ^ n £ 4. Thus there is a 
sufficient number of processors to be distributed to execute the 
simultaneous merge operations. 

- ^/c can now evaluate the running time of the described algo- 
rithm/Steps 1, 2, and 3 each require a fixed amount of time*; the 
recursive call represented by Step 4 involves sequence pairs whose 
smaller member has size at most [ x /ri'\ — 1. Thus the problem size 
has been reduced according to {he square root. It follows that the 
merge is completed with a recursion of depth at most log 2 log 2 n, 
i.e.,- the running time of the parallel-merging algorithm is 
O(loglogn). 

This merging algorithrrLcan now be ufced to obtain a fcpt parallel 
algorithm for sorting n numbers with n/2 processors. Assume for 
simplicity that n = 2 k . We begin by forming sequences of length 2 
and at each subsequent step we merge sequence pairs whose 
common length doubles at each step. At the ith step we merge 2 k "V 
sequenced pairs, of common length 2 1 " 1 : by the preceding dis- 
cussion this_can_be_done by the described merging algorithm with 
2 k ~" 1 -(Lv/2 1 " 1 r 2 l-1 J) = 2 k_1 processors, which are available by 
hypothesis. Thus in log 2 n merging steps sorting is completed; 
since each merging step runs in time at most 0(log log n), sorting 
can be done in parallel with n/2 processors in time at most 0(log 
n • log log n). . 

; Thus we see that the speed-up loss with respect .■ to the best 
known sequential algorithm ha^ been reduced to a factor 0(log 
log n), a substantial improvement brought about by removing the 
fan-out restriction and by devising an adaptive scheme. Noticei 
however, that the preceding analysw of Valiant's algorithm is re^ 
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stricted to the so-called "data dependence" of Sorting and ignores 
the complexity of another important facet of the algorithm, called 
"data movements." In fact, the adopted computation model con- 
sists of a set of identical processors, eaeh capable of random access- 
ing/a common memory; of course, an additional device— called an 
intlercpnnection network, or, briefly, a switch— is needed for aligning 
each processor with the memory cell accessed by it. The work done 
by this network is referred to as "data movements." - 
/ Finally, we mention that more recent results [17]^ [18] exhibit, 
for the same computation model, enumeration-sbrtihg parallel al- 
/gorithms, which have better time performance than the known 
merge-sorting pafallel algorithms described above. 

2.3. Convex Hulls of Finite Sets of Points, increasing attention is 
being currently devoted to the computational solution of problems 
of a geometric nature, for they occur in a number of fields, such as 
operations research, pattern recognition, design automation, , and 
statistics. This irtterest has coalesced into a new •branch of compu- 
tational complexity, adeptly called computational geometry by 
Shamos [^9], The objective of computational geometry is to recast 
geometric problems, some of which are classical, into a framework 
which makes them amenable to efficient computer solution. The 
techniques used in connection with geometric problems are essen- 
tially those employed in other areJa$ of concrete computational 
complexity, with the additional featiire that one must discover 
properties of the geometric objects involved which will simplify the 
computational task. 

An interesting problem in this area~is the determination of the 
convex hull of a finite set of points in Euclidean space. As is well 
known, the convex hull of a finite set of points S is the intersection 

_of all convex sets which contain S. Clearly, this definition is totally 
useless from a computational standpoint. In this respect, a more 
useful definition regards the convex hull H(S) of S as a subset of S 
with the property that any point of S is obtainable as a convex 

: linear combination of the points of H(S).* In general, the convex 



* A linear combination is said to be convex if its coefficients are nonnegative and 
add to 1. 
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hull H(S) of a d-dimcnsional set Silk a ^convex polytope in d-space, 
which becomes a convex polyhedrcfii and a convex polygon in; 
three and two dimensions, respective^. A d-dimensional pofytope 
K is bounded by (d — l)-dimensional\polytopes, called facets or 
faces, each of which lies in a hyperplane: in two dimensions the 
faces are line segments and in three* dimensions they are convex 
polygons. ; 

With this nomenclature, it is relatively easy to understand a 
general convex hull algorithm, due to Chand and Kapur pff\. This 
algorithm is based on an idea; 'which is commonly referred to as the 
"gift-wrapping principle." Let K be a d-dimensional convex poly- 
tope and assume that a facf f of K is given. This face is delimited 
by (d 2)-dimensional.polytopes, called hyperedges^(\\ne segments 
or points in three and two dimensions, respectively), and let e be a 
hyperedge of f. The edge e and every vertex of K determine a 
hyperplane. Fbr any such hyperplane we compute the inner prod- 
uct of the orthogonal unit vector with the unit vector orthogbnal to 
the hyperplane containing f. The hyperplane for which this inner 
product has largest absolute v^lue contains a new face of K. Thus, 
t>y -going from face to face, one can identify all the faces of the 
convex polytopes, and the choice of the phrase "gift-wrapping prinv ? 
ciple" is entirely justified jn its intuitive three-diitiensibnal interpifeV 
tation. The preceding informal discussion also shows that, gjveri a 
set of n points in 3-space, application of the gift-wrapping principle 
will identify the vertices of the convex hull. To analyze the compu-; 
tational effort, notice that the determination of each new vertex of 
the hull involves work O(h); since all n points could belong to the 
hull, the total resulting work is 0(n 2 ) in the worst case. 

Thejquestion has been raised whether, faster algorithms can be 
designed in cases of low dimensionality. The answer is affirmative 
both for two and for three dimensions, where algorithms are 
"knowri"with running time at most 0(n log n) for sets of ri points. 
Since the three-dimensional algorithm is quite involved [21], it will 
not be discussed here; rather, we shall discuss two two-dimensional 
procedures involving quite different techniques and yet achieving 
the same order of time complexity. In this manner, it is possible to 
gain considerable insight into the techniques of computational 
geometry without the burden of complicated details. 
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ingtfie specific algorithms, which obtain the so- 
called ordered convje^JUiliri^ the sequence 6i the vertices of the 
convex hull polygon, it is wortlrpointing out that the order of their 
time performance is optimal. To shdwtot^wsjr^t first specify 
the computation model. We shall adopt a randomaccess-machW 
(RAM) with the variant that integer arithmetic is replaced by real 
number arithmetic. Of course tjiis does not entail infinite length 
operands; it only means that finite length operands are approxi- 
mations of real numbers, as it normally happens in floating-point 
arithmetic. . %1 ' . | 

A common method t6 (establish a lower bound for the compu- 
tation time of a problem P r is a suitable reduction to P x of some 
problem P 2 for which a lower bound is known. By "suitable re- 
duction" we mean that P 2 may be reformulated as P t with an effort 
whose complexity is not greater than the lower bound. In our case, 
the following argument— due to Stan Eisenstat [19]— shows that 
the problem, of sorting n numbers can be reduced to finding the 
convex hull of n points in the plane. This transformation is doable 
in time 0(n), as follows. Let U = {x x , x n } be a set of , n num- 
bers; corresponding to each x l construct — with;, 'a; single 
multiplication — the point (x,, xf). : The set of pbipts S == {(x,, xf)[ 
Xj e y} lie on the parabola y = x 2 (a cojivex curve); so that the 
ordered convex *hull of S gives the sorting of S. Since sorting is 
known to r^uire.Q(n log n) operations, the argument is complete. 
We shall now illu^tate injspijie. detail two convex hull algorithms 
for the plane.^/^^ " l v 

The first procedure we shall consider, which was also one of the 
first to appear, is due 'to R, L. Graham [22]. Let S be a set of n 
points in the plane. The points of S are assumed to^be expressed in, 
polar coordinates (p, 0) ^with respect to some point P internal to 
H(S) and some arbitrary reference half-line. (Should; the points be 
expressed in a different coordinate system, the conversion can be 
effected in time O(n): in fact, P can be found as the;cfcntrpid of S in 
time O(n) and the coordinate transformation requires ;a fixed 
amount of work per point.) The points (p k , 0 k ) lire then sorted in 
order of increasing 0 and let (r l5 4>ih (^2i'4i\-'-''i''X^n9'4!ti ^ * e 
resulting sequence with 0 ^ <f> x < • < 27t. This sorting oper- 

ation requires work 0(n log n). Next the algorithm scans the se- 
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quence, each step consisting of the application of the following rule 
to the three-point configuration illustrated in Figure 7: 

If a -h ^ 7t, delete (r k+1 , <£ k Vi) aifd substitute the index triplet 
(k - 1, k, k + 2) for the triplet (k, k + 1, fc+ 2); else, set k <- k ' + 1. 
and proceed. 




P 



Fig. 7. Illustration of key feature of Graham's algorithm. 

Clearly, each application of this rule either eliminates one pre- 
viously examined point (r k+1 , <£ k +i) or it examines a new point 
(r k + 3 , <£ k+3 ). It follows th$pthe rule can be applied at most twice 
per point, for a total of 2n applications. Thus^ the initial sorting 

• pass determines the ^m^aity. of the algorithm. 

The following v coQy^^^^fearithm is due to Shamos [19] and 
involves an entirely. dinin^®miique. It is based on finding the 
hull of the union of two convex polygons. It can be shown that the 

m latter can be found in time at most proportional to the total 
number of vertices of the two polygons. To avoid inessential de- 
tails, we assume for simplicity that n be a multiple of 3. The points 
of S are then arbitrarily partitioned into n/3;subsets, each contain- 
ing 3 points. Each such subset determines a triangle, i.e., a convex 
polygon (Figure 8(a)). We consider disjoint pairs of triangles and 
replace them with the hull of their union, a convex polygon with at 
most 6 edges; this can be achieved with work at most 0(^/3) for 
some constant k x (Figure 8(b)). Next we pair these pojygons and 
again find the hulls of their unions : this stage also requires at most 
0(2^/6)^ 00^/3) operations. Therefore at each stage we use 
OOqn/3) operations to halve the number of polygons: clearly, after 
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- (a) (b) (c) 

Fig. 8. Illustration of a convex hull algorithm by Shamos. 



flog 2 nl stages, i.e., with 0(n log n) operations, H(S) has been com- 
puted (Figure 8(c)). < 

Notice that both algorithms have optimal order of time com- 
plexity. Only the former uses sorting explicitly, whereas the latter 
mimics the general merge^sort technique (see Section 22) in the 
construction of convex polygons with increasing numbers" of 
vertices. , -* 

Other optimal convex-hull algorithms for planar sets are known, 
but their presentation exceeds the scope of this chapter; the interes- 
ted reader is referred to [19] for further details on this topic and for 
further exposure to a variety of problems of computational 
geometry. 
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COMPUTER SCIENCE AND 
ARTIFICIAL INTELLIGENCE 

James R. Slagle 



This article examines the role of computers as intelligent ma- 
chines. A pivotal point in this discussion is that these machines are 
performing activities often called intelligent when performed by 
"humans. Included herein are machines that play games, prove 
theorems, solve calci/lus problems, aid in the manipulation of 
mathematical expressions, discfern and differentiate among chemi- 
cal structures, and direct the activities of physical robots. These are 
not hypothetical machines -virtually all of these projects have been 
reduced to practice oft properly operating digital computing sys- 
tems. Moreover, each machine (or, at least, its crucial component) 
appears extendable, so thht its operating characteristics can be ap- 
plied to more difficult pr&blems. 

1. CHARACTERISTICS OF ARTIFICIAL INTELLIGENCE ■ , 

f: > 

X fundamental motivation in this fi6ld has been to devise infor- 
mation processing systems (i.e., computing equipment executing apr 
propriate algorithms) whose behavior is considered to be intelli- 
gent. That is, we are willihg to ascribe that term to the same 
activity when observed in a human. ' 
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Because the term itself is so highly evocative, it will be helpful to 
state some, dictionary definitions of intelligence so that there is a 
common reference point. Webster's New Collegiate Dictionary 
(1956 edition) defines intelligence as: 

: ' V. * • ' ' ' '.■'■'.■••■'!■'. k " ' '■ 

A. The power of meeting any situation, especially a novel situation, sue- , 
ccssfully by proper behavior adjustments. B. The ability to apprehend inter- 
relationships of presented facts in'such a way as, to guide action toward a/; 
desired goal.* "' <,- ! - h'- ' 7c ' :■ ■ ?. ■ *\ 

While we may resist the idea emotionally, it is not difficult to apply 
these definitions to the behavior of a machine as well as to that of a 

.. human; Intelligence is multipurpose in nature* and involves the 
ability tb y learn. This characteristic (or/ at- least, an excellent imita- 

, tiolfi of it) is fundamental to heuristic algorithms and programs, land 
we shall have more to say about the desirability of incorporating it 
into a wide spectruih of comlputifig activities. '| * 

Almost all of the machines to be discussed in this article .are 
high-speed general-purpose stored-prbgram electronic digital com- 
puters. In thei^most important aspects, they fall into the general 
category of Von Neumann machines outlined in the first article of 

v this study. Accordingly, for purposes of this article, we are: con- 
gent rating* on activities that are implemented as sequential 
processes. \ 

1.1. Approaches to Artificial Intelligence. Researchers charac- 
terize artificial intelligence in one of three ways : 

1. artificial networks 

2. artificial evolution 

3. heuristic programming. ; ■ 

Since it i§ currently impossible to say anything-conclusiye about 
the preferability of one approach over the. others as being the 
model that parallels (natural) intelligent activities most^ Closely, re- ^ 
search toward and implementation of intelligent machines contin-r 



* By permission from Webster's New Collegiate Dictionary, copyright 1916, 1925, P 
1931, 1936, 1941, 1949, 1951, 1953, and 1956, G. and C. Merriam Company, Spring- 
field, Massachusetts. > . v > * 
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ues to ( reflect the approaches m^ost congruent with the perceptions ; 
of the respective investigators. Accordingly, this article will empha- 
size the Heuristic approach, with the others receiving only brief 
mention. x v 

A network consists of an arbitrary number of simple elements 
along with the interconnections among them. An artificial network . 
may be' a physical reality or" it may be simulated on a computer. 
Very often it is helpfuLtg perceive of each element as ajv artificial 
neuron. One advantage^)? this approach is that the network usu- 
ally is ad^Dtwe; that 4s, it can "learn" from experience. Researchers 
who take^tre artificial-network approach tend to' perceive natural ' 
intelligence as being based on (natural) neural networks alone. At 
present, artificial networks have "learned" to recognize simple 
visual arid aural pattern's, a level of performance considered to be *; 
short of intelligent behavior. One difficulty with this approach is 
that there is little prospect' of producing an artificial network that 
approximates the size and complexity of a brain (i.e., in the order of 
magnitude of 10 10 neurons). Another deterring factor -is out inconi- 
plete understanding regarding the operating characteristics of and 
interconnections among neurons. 

In ■', the artificial-evolution jlpproach 'to artificial intelligence, 
computer-simulated systems are designed to evolve by "mutation" 
and "selection." Using predefined criteria for determining such * 
selections, systems have been devised which evolved into vehicles ' 
'for solving simple equations. Proponents of this approach point 
>out th^t many people think that human intelligence evolved 
through a process in which mutations and natural selection played 
crucial roles. Here, again, natural evolution is not understood suf- 
ficiently to enable close parallels to be drawn. Moreover, the 
analogous process in computing systems must proceed at an enor- 
mously accelerated rate (in comparison *to natural evolution) in 
order for itsjresults to have-substantial practical impact. 

The approach addressed here revolves around the use of heuris- 
tics. These are rules of thumb, strategies, methods, or tricks aimed 
at improving the efficiency of a system which tries to discover 
solutions to complex problems. Another trip to the same dictionary 
produces the following information for "heuristic": "Serving to dis- 
cover/* It is related to the word "eureka" (*T have-jpund it," from ' 
the Greek heuriskeiri, to discover, tafind). Heuristics, then, fo$m the 
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centerpieces for many artificial intelligence systems. Some of the 
heurisdi programs to be discussed- dan play checkers and chess; 
others caii deduce answers to questions from a store of given facts; 
still others solve calculus problems or prove theorems in math- 
ematical logic apd geometry. For example, there are heuristic ge- 
qmetry programs - «that. can prove the following rather difficult 
theorem: . \ 

' If the segment joining th^ midpoints of the diagonals of a trap- 
ezoid is extended to intersect the side of the trapezoid, it dissects 
' that side. < * ; 

Some other heuristic programs can "learn" from their experience. A 
few are multipurpose in the sense that they; are useful in solving 
several kinds of problems, i.e., they are applicable to several do- 
mains. For example, a heuristic for "working backward" (in a sense, 
"undoing" an activity that was found to be futile or unproductive) 
is useful in many theorem-proving and pattern-matching domains. 
Other heuristics are very specific, limited to one problfcm-sblving 
domain (such as theorem-proving in geometry). 

1.2; Purposes of Heuristic Programming. When used in the con- 
text of artificial intelligence, a heuristic program reflects the follow- 
ing motivations: • 

1. an attempt to gain additional understanding o^naWal intelli- : 
'genceT .-./'/ ^""^W;; 

2. development and use of machine intelligence to acquire know- 
ledge and solve intellectually difficult problems, x :n 

A researcher concerned with the first aspect, for example, may be a 
psychologist interested in exploring some facet of Human behavior 
regarded as being intelligent. Based on ^personal observations, 
available experimental data^and reports 'in ', the literature, the inves- 
tigator will define a logical structure intended to model the beha- 
vioral aspect of interest. When implemented as a heuristic program, 
that model can be run and its results compared against those ob- 
served experimentally. This validation process is Repeated/ with 
each cycle serving to identify (and remove) discrepancies in the 
model so that it becomes a progressively, improving reflection of 
the perceived reality. (Further discussion of this validation process, 
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in a more general context, is found,elsewhere in this study, in Mark 
r Franklin's article on computer simulation.) Deficiencies in the 
model revealed Joy this iterative process may prompt further experi- 
mentation, and the process continues until the modeled results arew 
in stable agreement with those obtained over the observable 
domairi.;Once that has been achieved to the investigator's satisfac- 
tion, the behavioral aspept under scrutiny then can be "observed" 
•simply by running the program on the computer, 
; A researcher motivated by the second purpose is interested in 
8 producing intelligent behavior, with less concern as to whether or 
hot the underlying process duplicates or parallels that employed by 
humans. His hope is that the computer, driven by a heuristic pro- 
gram, eventually will solve important complex problems in the 
physical, biological, and social sciences. 

In this article we shall be concerned primarily with the second of 
these two purposes. While routine use of complex problem-solving 
computing systems still lies in the future, there are intelligent sysr 
terns in current use whose behavior attests to the fact that there has 
been impressive progress toward that end. Some of these will be 
examined and discussed in subsequent sections. 

-2.' PROGRAMS THAT PLAYGAMES 

An important conceptual operation in many heuristic programs 
involves the selection^ logical possiblilites whose respectivelconse- 
quences are more desirable than others. The collection \» these 
possibilities can be specified quite effectively when represented in 
the form of a tree. (As Figure 1 shows, this data structure is 
characterized more accurately as an upside-down tree, since the 
"root" is at the top.) While tree-structured data aTe useful in a wide 
variety of information-processi|4g&x)ntexts, we shall focus on two 
types that are of particular interest in heuristic programs r game 
trees and goal trees. \ V ; ; . 

The branches in a game tree represent moves, replies, and coun- 
terreplies. In a goal tree, some ultimate goal is sho^kto be achiev- 
able if certain subgoals are achievable.) A subgoal, l^turn, may be 
shown to be achievable if certain of its subgoals are achievable, and 
so on, to some arbitrary level. (As we shall see later on, it is pos- 
sible to establish equivalence between ^certain types of game trees 

7;^ 
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j Fig. i. An explicit game tree. 



and goal trees.) Since trees have a tendency to become very large, 
even when representing modest collections of logical possibilities, a 
crucial component in many heuristic programs is a procedure for 
searching these trees effectively. Consequently, a substantial 
amount of associated work is concerned with finding ways to mini- 
mize the search for relevant parts^of the tree by "anticipating" (and, 
therefore, avoiding) ultimately fruitless searches. 

Since our immediate concentration will be on game-playing pro- 
grams, we shall examine game trees more closely. 

2.1. Characterization of Game Trees. A game tree is either ex- 
plicit or implicit As the name implies, an explicit game tree is ' 
shown in its full structure (Figure 1). Each move is depicted, along 
with its consequences/ An implicit game tree, on the other hand, is 
described only in terms of an initial position and a set of rules for 
generating the tree. The game of checkers is an appropriate exam- 
ple: By knowing" the characteristics of the board, the starting posi- 
tions of the 24 pieces,^ and the rules governing- their legal move- 
ments, we can generate a tree that depicts completely the conse- 
quences of each possible move under each possible set of circum- 
stances. ■ .V / 
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Referring to the explicit T trcc in Rgtirc 1, we sec that the nodes 
(squares, circles, and trjangles) represent game positions, with the 
top node representing the starting point. The connecting line scg- , 
mcnts, then, represent moves. A mode's shape defines the player 
whose turn it is to move. Thus, the square player makes the first 
move from position P. If the square moves to position P 3 , then the 
square loses while the circle and triangle both win. We shall say 
that position P is at level 0 and its successors (i.e., P|, P 2 , and P3) 
arc at level 1. The successors of P's successors arc said to be at level 
2, and so on. In Figure 1, assuming a good play on the part of the 
other players, the square playercan force a win by selecting a move 
to position P 2 :, the only good move for the circle player, then, is to 
the square-position from which the square (expectedly) will cause 
himself (and the circle) to win. Note that an actual move transforms 
a game tree into another game-tree. For example, the move to 
position P 2 transforms thetree in figure 1 into the tree in Figure 2. 

An implicit tree consists of a top position together with rules^ 
which can be used to generate successors of many positions. These 
rules include termination criteria so that no successors can be gen- 
erated for a position meeting these criteria._A procedure which 
starts at a particular position and follows the game's rules is 
termed a generation procedure and, in effect.^sueh a procedure 
converts an implicit tree into an explicit one. 

Generation procedures vary in the order in which they generate 
positions of the tree. A rough but helpful categorization contrasts 




o loses o wins 
o loses o wins 
a wins a loses 



Fig. 2, Transformation of Figure 1 resulting from movement to position P* 2 . A 
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breadth-first procedures with depth-first procedures. Generally > 
speaking, the former type generates from the top. That is, it prb- 
duces all the positions at level 1, then all the positions at level 2, 
and so on. For example, suppose that thevtop position of an im- 
plicit tree is the square-position P shown in Figure 3 and the 
generating rules are as follows: #■' \ 

1. A square-position generates two positions at the next lower 
level: a circle-position to the right and a triangle-position to 
the left. 

2. A circle-position generates two lower-level positions: a 
triangle-position to the right and a«square-positiori to the left 

3. A triangle-position generates two positions at the next lower 
level : a circle-position to the right and a square-position .to 
the left. 

Application of these rules in a breadth-first procedure, then, gen- 
erates the explicit trees in the order shown in Figure 3. 

A somewhat more complicated parallel can be cited using the 
familiar gamp of checkers/ The rules of the game, applied in a 




(b) generation of positions at* level 2 



Fig. 3. Breadth-first generation procedure. 
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breadth-first procedure to the front rovy (the only one that can 
move when the system is at the top position), generate seven suc- 
cessor positions. Completed application of the procedure to the 
successors generates seven level 2 successors for each one, so that 
we produce an explicit tree at level 2 with 49 positions. 

A depth-first procedure generates a tree from left to right in the 
following sense: The procedure starts by generating the top posi- 
tion's first successor and then, in turn, its first successor at the next 
level, and so on. To illustrate, suppose that we have an implicit, tree 
with tlie same top position and generating rules as the one in 
Figure 3. In addition, we shall define level 3 as the maximum 
depth, thereby imposing a termination criterion. The square- 
position at level 0 is labeled A in Figure 4 for convenience. Now, if 
we apply the generating rules in a depth-first procedure, the posi- 
tions will be generated in the order A, B, C, D, and E. Since the 
termination criteria have been met for that branch of the tree, the 
procedure would then pick up the alternative branch, starting from 
position F and working to complete that branch to the maximum 
depth: This progression is shown in Figure 4(a) and 4(b). Implicit in 
this progression is the fact that a termination criterion always must 
be given for a depth-first procedure. 

When a depth-first procedure is applied to checkers, the 49 posi- 
tions comprising the top two levels of the checker tree would be 
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generated as before. However, the sequence in which they are gen- 
erated is different: After generating the first successor position, its 
seven reply positions will be generated.. Only then will the depth- 
first procedure return tadevel 0 to generate the second successor 
position, and so on until each group of 7 reply positions has becrt 
generated from its respective level. 1 position. 

2.2. Purposes for Programming Gatne-Playing Computers. Aside C 
from the obvious recreational interests in producing game-playing 
computing systems, these games are useful vehicles because, despite 
their relative simplicity, they reseipble many important real prob- 
lems. In many cases, the complexity of the real problems makes 
them highly resistant to direct attack, so that the researcher hopes 
that the methods developed 'to handle simple problems can be 
extended systematically to encompass a wider range of complexity. 
The resemblance between games and real problems certainly is not 
farfetched: The intelligent participant chooses actions based on his 
search of the tree of future possiblilites, on his rough evaluation of 
possible future situations/ and on expectations about what others 
will do. Business games, medical games, and war games are in- 
tended to serve as models for real problems. 

Games tend to be' relatively simple because their rules are well 
defined and relatively straightforward. The selection of games as 
arenas for developing and improving heuristic programs reflects a 
conscious decision on the part of«researchers to the effect that the 
advantages of simplicity outweighed those associated with closer 
approximations to reality; (The implication is that a working heu- 
ristic for a well-defined system can ultimately be extended to ac- 
commodate a more realistic situation that is less well-defined.) 
Moreover, there are systems that a!re both realistic and well-defined 
on which heuristics can be made to work quite, nicely. Theorem 
. proving and assembly-line balancing are two such examples. 

2.3. General Description of Game-Playing Programs. Most of 
the games implemented by the programs described in these sections 
can be characterized as two-person strictly competitive games. The 
first of these characterizations is self-explanatory ; "strictly competi- 
tive" (or zero sum) refers to the fact that whatever one player wins 
the other player may be considered to lose. Thus, the outcome of 

25* ' - , . 
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such a game may be defined by describing 'what happens to just 
one of the player^ (By implication, then, cooperation in a zero sum 
game is never worthwhile.) Very often such programs play against 
other programs but, for interest, we shall frame our -discussion 
around a machine player and a human opponent/ 

Clearly, the ability to follow rules, and generate positions is only 
a small part of the overall game-playing structure. Another impor- 
tant ingredient is the ability to evaluate how good a position is for 
the machine (i.e., how bad it is for the opponent), The program 
evaluates well to the extent thaKit assigns high (positivd) values to 
good positions and low (negative) values to bad ones. Since the 
game is strictly competitive, the negation of a particular value 
serves as an estimate of how good that position is for the human 
opponent. Using the perspective established by the foregoing dis- 
cussion, we can turn to a general description of the consecutive 
steps followed by game-playing programs. Exceptions will be noted 
as necessary. (For purposes of calibration, a a step" in this context 
corresponds to a sequence of program instructions numbering in 
the hundreds.) 

A. The human submits an. arbitrary position in the game. Pro- 
ceed to step B or step E according to ' whether it is the 
human's or thfe computer's tiirn to move. 

B. The^uman submits his move. 
C Generate the new position. 

D. If the termination criteria are met (i.e., the game is bver),^ 
print the result and stop; otherwise go to step E. 
; E. Generate some or all successors to the (top), computer 
position. ' " 

F. Evaluate each successor. (Note that this step may be arbi- 
trarily complex, involving elaborate searching of future pos- 
sibilities and their respective consequences.) 

G. Move to the successor with the highest value. 

H. Print the computer's move. A 

I. If the game is over, print. the result and stop; otherwise, go to 
■ step B. ' ' / • ..' - v;;^ 

The success of these heuristics, even with' surprisingly complicated 
games, is manifested in the proliferation of amusingj challenging, 
(and even ego-debilitating) computer-based competitiye'.gai^res. ■ \ 

■ ; "7 /: ; ' ... ; 25o 
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Before looking more closely at checkers and chess (the tradi- 
tional paradigms in artificial intelligence for examining hcur^tics 
• potentially useful in other problems), it will be helpful to redefine 
the major ingredients in a heuristic search procedure: . ; 

1. A generation procedure. 

2. A (static) evaluation function. 

3. A backing-up procedure. 

The first of these has been discussed before, and needs no further 
elaboration for our current purposes. The property of an evalu- 
ation function that makes it static is the fact that it assigns a value , 
to a given position without generating any of its successors. In 
contrast, a backing-up procedure a'ssigns a value to a position 
based on the values of that position's successors. 7*^1-^. ■> 

2.4. Evaluating a Position in Checkers and Chess. /The nidst com- 
monly used backing-up procedure is the minimax procedure: Since 
the program's static evaluation function assigns a numerical value 
to each game position such that the greater the value of the func- 
tion, the better the position tends to be, we can think of the ma- 
chine as bein^ the maximizing player (or, simply, Max). Corre- 
spondingly, its opponent can be considered the minimizing player 
(or Min). A position in which it is Max's turn to move is called a 
Max-positibn. For example, in Figure 5, square (i.e., the computer 
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Fig. 5. The minimax backing-up procedure. 
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or maximizing player) uses its given termination criteria to deter* 
mine not to generate any* positions below level 3i Having es- 
tablished these terminal nodes, let us suppose that the static evalu- 
ation function assigns the value v: lu = 40 for position P in . Simi- 
larly, the value v: 112 - 10 is assigned to position Pn 2 , vr^i « 30 
for position P 12l , and so on, for all 8 positions as shown in the 
figure, We shall not dwell here on hoW these static evaluations arc 
assigned. It should be said, however, that much of the exploratory 
1 work in an artificial intelligence system centers around the determi- 
nation of a meaningful evaluation function. Having obtained these 
values, the minimax procedure will back up the value of the best 
successor of the Max position* i.e., the one with the maximum 
value. In the case of Figure 5, the maximum successors for each of 
the four level 2 Max-positions are P 1U (value =.40), P 12 i (value 
= 30), P 212 (value = 29) and P 222 (value = 80). Since the positions 
at level 2 are Max-positions (i.e., successors to their respective Min- 
positions), procedure will back up the value of the best successor of 
the Min-position.' By definition, this is the one with a minimum 
value. Hence the procedure backs up the value of 30 (at position 
P 12 ) to position P lf and the value of 29 (at postion P 21 ) to position 
P 2 . As a result, then, the backing-up procedure would lead square 
to select a move to position P x over P 2 . This minimizes the 
"damage" that the opponent can do by making his best moves. If 
square moves to P 1? circle's best move is to position P 12 , in reply 
to which square's best move is to P 121 , with a terminal value of 30. 
The alternative (i.e., aa initial move to position P 2 ) prompts a reply 
on circle's part to position P 21 from whence square's best move (to 
P 212 ) produces a terminal value of only 29. 

2.5. Elaborations oh Search > Procedures. Having established the 
roles of a static evaluation function, a backing-up procedure, and a 
generation procedure in an overall search system; we shall look at 
some possible improvements in these processes. 

One version of the basic search procedure combines a static 
evaluation function and a minimax backing-up procedure with a 
depth-first generation procedure. This approach has been employed, 
successfully in a number of game-playing programs and will serve 
as a helpful precursor to the more effective alpha-beta search. To 
illustrate its nature, let us suppose that the tree of Figure 5 had 
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been given implicitly and wc start'again at the top level, i.e., posi- 
tion P in Figure 5.' This litis been relabeled A in figure 6(a) in order 
to emphasize a sense of sequence. As- Figure 6(a) indicates, the 
depth-first procedure generates positions B, C, and D. Since the 
termination criteria arc met at that level, the procedure uses its 
static evaluation function on position D, producing a value, say, of 
40. It then generates position E and computes a value for that 
position (namely, 10). The bctter-of these two values, namely, 40,. is 
backed up to position C. The procedure then generates F, the 
alternative Max-position at level 2, from which it generates and 




Fig. 6. Depth-first minimax procedure. 
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evaluates positions G and 'Hi This is shown diagrammrttically in 
Figure 6(b) where the two terminal nodes arc shown to have 
assigned values of 30 and 20, respectively. The better of these two, 
namely, 30, is backed up to F, at which point it can select tho.hcttcr 
value between C and F rtnd back it up to B.j(Rccall that "better" in 
this instance is the minimum value, i.c, 30.) Now, the procedure can 
gcricratc I and J, followed by generation and evaluation of K and L 
(Figure 6(c)). The backing-up and generation process continues 
until the procedure obtains the result shown'iiY Figure 6(d). Conse- 
quently, the choice will be a move to position B, since it hasji 
higher value than the alternative position, ^c., I. 

FjTom the foregoing discussion, it is seen that the generation of 
successor positions is separate from their evaluation That, is, a 
scries of successors is prepared before any evaluation is undertaken. 
Although this procedure works (limited, of course, by the efficacy 
and reality of the evaluation function), it is inherently inefficient 
since it generates a number of ultimately inferior positions. The 
"alpha-bcta"v|>roccdurc overcomes this deficiency to a,considcrablc 
degree. It is equivalent to the depth-first minimax procedure in that 
it will always choose the same move as the other, given the same 
top position, termination criteria, and evaluation function. How- 
ever, the primary difference lies in the fact that generation and 
evaluation arc interleaved rather than sequenced. Thus the alpha- 
beta procedure almost always chooses its move after generating 
only a very small fraction of the tree produced by the equivalent 
depth-first minimax procedure. The strategy pivots around the 
computation of two limiting values at a given Max-posi'tion: Alpha 
is a backed-up value for a position that is computed after .a depth- 
first procedure followed the generating rules until termination cri- 
teria were inct. This is not. necessarily a final value; rather, it is ah 
inttrim evaluation, serving as a minimum limit. In this capacity, it 
regulates tie generation of other parts of the search tree. If the 
procedure inds that the value of another Max-position at that level 
does not exceed alpha, then there is no reason to continue gener- 
ating successors from that position. Similarly, a maximum limit, 
beta, is ? ^^fpUshed for a Min-position. / 

Additional insights into the alpha-beta procedure can be gained 
by considering these simple 'exercises: In Figure 7, we see part of a 
tree in which an^nterim backed-up value v^ = 3 already has been 

• ' • Sou . . 
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■ / < Fig. 7. The alpha-beta procedure finds ah alpha cutoff. 

•'•*.' . •*. ' " •' ' * / ■. 

".' '• ■ " • #> . • " m ■ ' • : & .■ 

established for position^!. Consequently, it uses that as a limit 
'! (alpha) to be impbsedron position P 2 . Now, suppose that the pro- 
cedure, through 'evaluation, assigns a value of 2 to .position 
, this value Were to be. backed up to its predecessor (positib^K^^ 

* would fall below 'the limiting value (alpha) established i^^jM 
position.. Consequently, the procedure reaches wfiat is temiejS: an 
aipha cutoff: We see that there is no point in generating any other 
successors to P 2 (i.e;, P 22 , P 23 , arid all of their successors). The 
procedure can "conclude" that Max will not choose the move to 

: position P 2 because if is already better off (at least so far) by 
mdving to P^ Having eliminated this move from contention, there 
is no ne,ed to generate its successors, arid the alpha-beta procedure, 
instead, can go on to generate* another position at P^s levels, -i.e., 

^P 3 '.-' '• , .■■ ; '/ : 

A similar example is seen in Figure 8: after obtaining y : u = 4 

• from position P n , the alpha-beta procedure sets a limit of beta = 4 
at«P 12 '. Nbw, suppose that the evaluator assigns v: 121 = 8. to posi- 
tion P 12i . If this value were to be backed Up to position Pi 2 , it 
wouldr exceed the proyisiorrSI limit of '4 established at that position. 
Thik, then, would be a (beta cutoff. From Max's point <of view, a 
move from position Piiis less desirable in comparison to a move 
from position P n . Therefore, there is no need Jo generate any other 
successors of P 12/ (§uch as P 122 , P 123 , and so bn). Instead, the 
alpha-beta procedure; can drop back to P t and generate another 
successor (i.e., P£ 3 ) from there and repeat the process of compari- 

^|0n ; against beta. ' ? . . ' 

^^|?is evident, then, that,the savings re^ii^ed with the alpha-beta 
&cedure increased dr^m^ically wiiteftie number of levels that 
^V^to be generated b^fce 7 the termination criteria are mpt'. This 
ferms the basis for further improvements in the search procedure: 
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Fig. 8. The alpha-beta procedure fitf js a_ beta cutoff. 



Clearly, there arfe ftirthe^vgains to be realized if the number of alpha 
and beta cutoffs can be^ihereased: Moreover, those gains would be 
intensified if such cutoffs could be made as early (i.e., as high in the 
tree) as possible. Accordingly, additional variations can be intro- 
duced to reduce the' sOmewhaf arbitrary rfature of theT depth-first 
alpha-beta procedures. Inevitably, these enhancements complicate 
the overall search procedure, so^that there always is a necessity to 
consider the tradeoff between the increased efficiency and attendant 
complexity. .". : >.-- .. 

Many of these 'ehhancemetits are ^centered around the idea of 
shallow searching. That is, instead of following a depth-first search 
down to its terminating nodes, a depth-first minimax procedure ior 
: ail alpha-beta procedure may Se interjected across several positions 
at a given (relatively high) level. Intermediate results obtained can 
then be used to ^prejudice, subsequent behavior, of. the deeper 
searches. One way of exploiting the shallow search's results is to 
order subsequent searches so that the first one in a series is likely to 
be the best one (i.e., the one producing high alpha/low beta values), 
thereby resulting in early cutoffs for subsequent searches. This 
approach is called plausibility ordering. v 

In> another, related approach, called forward pruning, not all 
successors of a gtfsn position are searched. The time saved by 
avoiding (presumably) unpromising branches may be used in 
searching more, fruitful bqes to a deeper level. This advantage must 
befcalanre<d%^ of failing to searchtelevttnt branches, 

Onj^ method'^ n-best (forward, pruning, 
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allows the search to proceed below only the small n seemingly best 
successors of a position. , v , 

* 2.6. The Evaluation Function. It was mentioned earlier that the 
determination, of the static evaluation function for a given game- 
playing system is itself the subject of considerable inquiry. For the 
sake of. simplicity, many evaluation functions often are chosen to be 
linear. That is of the form c l y l + c 2 y 2 + + c n y n . This may be 
represented as the scalar product of 2 vectors, i.e., C • Y. Each y r is ' 
; some real-valued function called a feature of the position. For ex- 
ample, such a feature might be a piece advantage, relative mobility, 
etc. Each coefficient Cj is the Weight of the corresponding y } . A 
miniature evaluation t function in chokers, for instance, is ■> 
6k + 4m + u, where k is the king advantage, small m is the (plain) 
man advantage, and u is the undenied mobility advantage. The , 
coefficients for these features, are 6, ,4, and 1, respectively. Suppose, 
' then, in a position to be evaluated, the computer has two more 
kings; two fewer men, and one more unit of undenied mobility than 
does its opponent. The evaluation function would assign a value of , 
6(2) + 4(~2) + 1 or 5 to that position. 

2.7. A Program That Plays Chess. There are numerous chess- 
playing programs at various levels of proficiency that embody the 
types of procedures outlined earlier. An example is a program writ- 
ten by Richard Greenblatt, } Donald Eastlake III, and Stephen . 
Crocker (1967) at the Massachusetts Institute of Techn^gy. This 
program used the alpha-beta procedure combined with plausibility 
ordering and ri-best forward pruning bf moves. In addition, the 
program is equipped^ with book openings and other items of "chess 
knowledgec^Typicaliy, the program makes its. move (in a game) in 
about a;minute. A good player (not to mention an expert or 
master) usually beats' this or any other cjtiess, program, but the task;j 
is' becoming increasingly difficult as these program, designs im- 
prove. This particular program is an honorary member of the 
United States Chess Federation and v t» Massachusetts Chess "As- % 
sociation under the the name of Mac Hack Six. The program plays - 
in tournaments and is operated via telephone lines from a teletype 
:at the tournament sight. In an April 1967 tournament, the program 
won the Class D trophy; Over the past few years computer-leased 
chess tournaments have become a staple feature of most national 
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Table 1(a) 

A Tournament Game Lost by the Mac Hack Six Program. 
Black is Mac Hack Six; White is a human rated 2190. 
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BxP: Bishop tkkes Pawn 
Castling: 0-0 j- -King side'; 

0-0i0 *• queen side. 



computer conferences, with programs playing other programs. As a 
matter of general/interest, Tables 1(a) and 1(b) reproduce two tour- 
nament games played by Mac Hack Six. These are of particular 
interest because the former (a loss) was the first tournament game 
ever played by a computer (January 21, 1967), and the latter was 
Jhe first tournament/game ever won by a computer (a rating of 
2190 represents thaj/of an expert, almost a master) * 

2,8, .Conclusions to Be Drawfi from Checkers and Chess-Playing 
Programs, programs described in this section typically require 

* In l&g, Hans Berliner's backgammon program defeated a world champion in a 
tournament. ' : : 
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in the order of one minute per move, whereas tournament rules in 
chess and checkers allow four or five minutes per move. Checker 
programs play an excellent game (by -human standards), whereas 
chess programs are not quite so adept. In checkers, the successful 
player is able to perform fast and accurate searches on rather large 
trees. In chess, however, success is associated with the ability to 
invent and use strategies. The successful chess player mixes abstract 
thinking with move-by-move analysis. This selectivity enables him; 
to restrict his searches to relatively small trees. Accordingly, re- 
searchers are working to mechanize these types qf heuristics. 

In a sense, the heuristics embodied in these game-playing pro- 
grams emulate a kind of learning., The checker program, for exam- 
ple, uses a type of generalization in developing evaluation functions ; 
that are more effective than their predecessors. This has led to 
practical procedures for good (but not optimal) evaluation func- 
tions in a variety of contexts. Many researchers in artificial intelli- 
gence feel that these techniques will provide Valuable insights into 
human learning processes. Such accomplishments, however, still lie 
in the future; it is safe to say that, at present, no program l^®s 
better than the checker program. 

Table 1(b) . . 

■ A Tournament Game Won by the Mac Hack Six Program. 
W > - : White is Mac Hack Six; Black is a human rated 1510. . 




Move No. 



White 



Black 



2 
3 
4 
5 
6 
7 
8 
9 



P-K4 
P-Q4 
->QxP 
Q-Q3 
N-QB3 
N-B3 
B-B4 
B-N3 
0-0-0 
P-QR4 
K-Nl 
QxQP 
B-R4 
N-Q5 



P-QB4 
PxP 
N-QB3 



N-B3 / 



P-KN3. 
P-Q* 

PrFC4 



P-QR3 "* 
P-QN4 ' 
B-R3ch 



10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 

2>r 



N-B7ch 



QxQ 
Q-Q6 
Q-Q5 
NxP 
QxNch 



P-N5 

B-Q2 

B-N2 

NxP 

QxN 

N-B4 " 

B-KBl 

R-Bl 

B-K3 

RxQ 
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3. PROGRAMS THAT SOLVE PROBLEMS IN CHESS, * 
GEOMETRY, AND CALCULUS . 

* •"*.■*■ • 
, . / .... 

A wide variety of- intellectually difficult problems, including 
many problems in chess, geometry, and indefinite integration, share 
a certain common structure which can be represented in terms of 
implicit trees. Heuristic programs have been written to search such 
trees, ancr experiments with working computer programs have 
produce^ successful solutions to some fairly difficult problems. For 
example, Baylor and Simon (1966) performed such work with chess 
problems; Gelerntier and his co-workers (1959, 1960) did related 
work with geometry problems, and this author (1963) performed 
such experiments with problems in integration. The speed of these 
programs compares* favorably with that "of a skilled human 
problem-solver. Accordingly, research continues toward machine 
solutions for increasingly difficult and important problems. 

3,1. Representation of a Problem as an Implicit Tree* Geometry 
and chess problems are good examples of a fairly general kind of 
problem that can be represented by two kinds of implicit trees. 
These two representations ultimately will be shown to be equiva- 
lent. A chess problem, foj instance7inay be represented as an im- 
plicit, two-person~strictIy competitive game tree. Recall that the 
trees discussed in the previous section were conceptually similar, 
witK the exception that they jvere explicit. The tree in Figure 9 



□ 




Fig. 9. Top three levels of an implicit game tree 



owins • \ a wins j 
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illustrates this structure. We see that there are terminating nodes 
which, if reached, guarantee a win for the computer (i.e., white, or 
square). The problem, tfien, is to search (i.e., make explicit) enough 
of a game tree to prove that the computer can force a win. Figure 
10 shows the appropriate solution: The hibavy solid lines represent 
ia proof' that the computer indeed can force a win by selecting the 
right-hand~move from the given starting position. 

A geometry problem may be represented as an implicit and/or 
goal tree. The top, three levels of suchj a tree are illustrated in 
Figure 11. In this context, the problem is to prove some geometric 
conclusion, e.g., that two angles are equal, given certain hypotheses. 
Looking for a proof corresponds to thejsearch of an implicit goal 
tree whose top goal (the node labeled G in Figure 11) is "to prove 
the conclusion given originally." As implied in Figure 11, the top 
goal G is achievable if the disjunction^ (i.e., the inclusive OR) of 
goals Gj and G 2 is, achievable. This disjunction is represented by 
the square shape of G. If G, say, is to trove two angles equal, the 
goal Gj might be to prove that the angles are corresponding parts 
of congruent triangles, and the goal Gj 2 might be to prove that the 
anglfes are alternate interior angles of parallel lines. Referring again 
to Figure 11, we see that the goal G 2 ' is achievable if the conjunc- 
tion (i.e., the logical AND) of G 2 ^and G 22 is achievable. This 
conjunction is represented by the circular shape of goal G 2 . Figure 
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J goal goal 
ediately immediately f 

aviiievable achievable 

Fig. 11. Top three levels of an implicit goal tree. - " 

12 shows the solution by making the apprppriate parts of the tree 
explicit, Here again (as in Figure 10), the heavy solid lines represent 
the prbof. •» ' • 4 

Comparison of Figure 9 with Figure 11, and Figure 10 with 
Figure 12, shows that the two representations are equivalent. 
Specifically, the chess problem could have been represented just as 
well by an implicit, and/or goal tree; and the geometry problem 
could have been represented as art implicit, two-person, strictly 
competitive game tree. We shall see that .a problem in indefinite 
integration also can be represented in either way. 
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3.2. Objectives in Developing Problem-Solving Programs. The 

motivations that prompt the study of programs 1 for 
lems in chess, geometry, and indefinite integration are similar to 
those underlying the development of heuristics for game-playing 
programs. These problems, in effect, are simplified versions of real, 
potentially important ones. Moreover, since many people are famil- 
iar with tljese kinds of problems, they are in an excellent position 
to evaluate the solutions given by a program,' comparing its per- 
formance with that of humans. Thus these probtem typfes serve as 
vehicles for the development of procedures to handle a more gener- 
al range of problems, i.e., those problems representable by an im- 
plicit, two^JpKon, strictly competitive game tree or, equivalently, 
by an^ihpliciti and/or goal tree. Such procedures have the potential 
of yiclding ? val61able insights into the problem-solving process. 

As for purposes specific to each program, the geometry program 
studi^the use of models and the integration program is potentially * 
useful in itself. (The latter, for example, was extended to a calculus 
program capable of handling definite and multiple integration.) The 
geophi^ry program, in using a diagram as a model, rejects subgoals 
which do Hot conform to this model. This is particularly interest- 
ing, since humans use models to great advantage in successful 
problem-solviHg. ^ 

3.3. General Description of the Procedure. While a detailed 
examination of these problem solutions i^ell. beyond the scope of 
this introductory overview, we can characterize the general ap- 
proach in terms of a sequence of steps designed to navigate 
through, the goal tree to a satisfactory arrival at the originally 
stated goal or to a definite conclusion that the goal cannot/be 
reached. At the start, the procedure "knows" what the original goal 
is and has at its disposal a spectrum of defined resources: allowable 
transformations- that it can perform when in a given/state, and 
knowledge (e.g., axioms, other already proved th£orejris) that it can 
use to reduce the problem to a combination,™ suoprobl^TiMhat 
are handled more easily. Within this general framework, we can 
enumerate the following basic sequence of &teps> 

A. If the program succeeds in its atterppt for an immediate solu- 
tion with the original goal, print the answer and stop. 

B. If the program cannot proceed-because it has exhausted its 
available resources, print thislSct and stop. . I 
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C. If no untried goals remain, print this fact and stop. 

D. Select an untried goal as a basis for generating additional 
parts of the tree. 

E. If no more (sub)goals can be generated from the selected 
goal, go to step B. h / 

F. Generate the next untried goal from the selected goal. 

G If the program fails in its attempt for an immediate solution 
with the newly generated goal, go to. step E. On the other 
hand, if the try succeeds, prune^the goal tree with respect to 
this goal. The pruning will have one of the following three 
results: 

1. If the original goal is met, prflit the answer and stop. * ^ 

2. If the original goal is not met but the selected one is met, 
go to step B. 

3. If neither the original goal nor the currently selected one is 
met, go to step E. - 

Research in the improvement of" these problem-solving programs 
continues, fueled by a 'growing conviction that the^ resulting 
methodology constitutes the beginning of a general problem- 
solving theory. % 

4. AUTOMATIC THEOREM-PROVING USING THE RESOLUTION PRINCIPLE 

- w 

An alternative to proof-finding as an approach to automatic 
theorem-proving is the idea of consequence^finding. Programs 
using this latter approach, such as the one written by R. C. T. Lee 
in 1967, start with a collection of axioms and try to deduce conse- 
quences from, these axioms and select those that are "interesting " It 
turns out that both approaches use the resolution principle, a natu- 
ral and powerful rule of inference. 

It is this use of deduction that provides a substantial part . of the 
motivation for studying automatic theorem-proving. Of coufse the 
basic activity itself is intrinsically interesting^sifice proving a non- 
trivial theorem is an intellectually difficult problem. Interest is 
heightened, moreover, by the application of first-order predicate 
calculus to the design of such programs. In mathematical logic, one 
can express fairly conveniently almost all kinds of deductive argu- 
ments. Thus writing a theorem-proving program that uses predi- 
cate calculus allows the researcher to study deduction in its purest 
form. This ability (on the part of a program) to make deductions 

' J - ; . : 270 
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from given facts has been characterized by Professor John Mc- 
Carthy (1959) as common sense. When exercised in humans, it is 
considered an important part of patural intelligence. The>obvious 
extrapolation, then, is irresistible: A program that uses mathemat- 
ical logic to find proofs can be extended to deduce answers to 
questions. Such extensions have beem implemented successfully 
using frameworks of knowledge ranging from a set of major league, 
baseball statistics to the text of a children's encyclopedia. 

Clearly, the extension also can be taken in another direction: a 
future program that proves new and interesting theorems would be 
useful in itself. It would be a tremendous achievement, for instance, 
if soihe undefined but imaginable program of the future proved or 
disproved the famous Fermat or Goldbach conjecture. Providing 
additional motivation is the fact that mathematical logic is well 
suited to computers. Since logicians have labored for decades to 
make their rules of inference "mechanical," it is an attractive idea 
to develop computerised algorithms based on mathematical logic, 
since this is a well-formulated, well-studied branch of mathematics. 
Thus, writing and -using theorem-proving programs is a way to 
study mathematical logic. 

P C Gilmore (1960), H. Wang (1965), and M. Davis and H. 
Putnam (I960) are among the early investigators who used first- 
border predicate calculus in the design of theorem-proving pro- 
grams. Using formal inference rules, the programs progressively 
substituted constant terms for variables in well-formed formufas, 
checking at each step to see whether .the theorem hack bj^n proved.^ 
In 1965, J. A. Robinson developed an inference rate^talled the 
resolution principle, which served to unify many pf : the^elatively 
fragmentary theorem-proving algorithms in, use at that time. The 
resolution principle seeks to draw the most general possible con- 
clusion from two given statements, serving to delay (or even elimi- 
nate) the need for progressive substitution of literals in place of 
variables. The resolution principle is more natural, more intuitive, 
and easier to use than are the formal inference rules it replaces. 

4,1, Characteristics of the Resolution Principle, We shall examine 
the basic; use of the resolution principle by charting its course 
through some simple examples. This will provide background from 
which we can establish some generalizations about it. - 
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Example 1. We are going to seek proof of a simple theorem 
through use of the resolution principle. The statements given below 
are understood to hold for all values of their variables.. For in- 
stance, the statement PI holds for all x, for all v, and for all y. This 
is. true in the same sense in which the identity x 2 -y 2 = (x + y) 
(x — y) holds for all x and y. 

Theorem 1. Suppose i > ' 

PI : If x is part of v, and if v is part of y, then x is part of y. 
P2: A finger is part of a hand. 
P3: A hand is part of an^arm. 
P4: An arm is part pf a man. 

v "' . 
From PI through P4, we may conclude that a finger is part of a 

man. /' "* 

Proof of Theorem 1. A procedure that tries "to- find proofs using 
the resolution prinfciple first takes the denial of the conclusion and 
then tries to deduce a contradiction. In our current example, this 
denial can be expressed as follows: 

P5: A finger is not part of a man. , * 

To reach this denial, and its accompanying contradiction, the pro- 
cedure produces the following consequence: 

P6: If hand is part of y, then finger is part of y. 

This is called a resolvent of PI and P2. (More about this later.) P6 
is obtained by first "matching" (making identical) the clause P2 arid 
the first portion of clause PI by letting x be "finger" and y &e 
"hand." This substitution in PI gives the following intermediate 
result, a logical consequence of PI: 

PI': If a finger is part of a hand and a hand is part of y , tHen a 
finger is part of y. ;*».■'•• 

This is indeed a logical consequence of PI, since PI is asserted to 
be true for all x, v, and y, and therefore it must be true in the 
special case when .x is "finger" and v is "hand." The clause P6 is an 
immediate' consequence of PI' and P2. Usually, the' resolvent P6 
would be given directly without expressing the intermediate result 
PI'. ' 

272. r 
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Each of the three portions of the clause PI is called ait atom. For 
example, the second atom in PI is "v is part of y Thys, each of the 
clauses P2, P3, and P4 consists of one atom; clause P6 consists of * 
two atoms/Returning to the proof, we match the clause P3 and the 
first, atom in P6.by letting y be "ami." This yields the intermediate 
.result ^ " 

P6': If a hand is part of an.arm, then a finger is part of an arm. 
This, together with P3, yields the resolvent 

P7: A finger is part of an arm. . ( 

Similarly, the resolvent obtained by matching P7 and the first atom 
of PI is: 

. P8 : If an arm is part of y, then a finger is part of y. 
The resolvent of P4 and the first atom of P8 is : 

P9: A finger is part of a man. 
Matching this with P5 gives a contradiction of the denial, thereby 
completing the proof of Theorem Tl. This proof is outlined in the 
first and secqpd columns of Table 2. In this table the corresponding 

- . . ■ % 

Table 2 
Proof of Theorem 1. 



Proof in symbols 



Clause 
name 


Proof in h&rds 


\ 

■ \- Clause 
■> 


Reason 


PI 


-If X is part of.c and if o is part of/, then 


Pari (x,t>) Sc Part (v j)-* 


Given * 




x Is part of y. 


Part (x y) .' 


Given 


P2 


A finger is part ofa-hand. '■ ^ 


Part (finger, hand) 


PJ 


A hand is part of an arm, p 


Part (hand, arm) 


Given 


P4 


An arm is part of a man. 


Part (arm, man) 


Given 




^> finger is not part of a man. 


— Part (finger, man) 


Denial of 




conclusion 


P6 


If a hajiJ is part of/; then a finger is part of/. Part (hand, y)-+ 


r(Pla,P2J 






Part (finger, y) 




P7 


U A finger is part of an arm. 


Part (finger, arm) . 


r[P3,P6a} 


P8 


If an arm is part ofv, then a finger is part of/. Part (arm,/)-*- 


r[P'la.P7J 






Part (finger, y) 




P9 


A finger is part of a man. 


Part (finger, man) 


r[P4,P8al 


P10 


Contradiction. 


Contradiction 


r[P5,P9) 
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proof in symbols is self-explanatory, except for the following: The 
expression r[P3, P6a] denotes the resolvent obtained by matching 
the clause P3 with the first atom in P6> The symbol & means 
"and," the symbol—* means "if then ..." or "implies," and the 
symbol — means "*iot." 

Example 2. The resolution principle consists of only factoring 
and resolution. This example illustrates factoring, disjunctive nota- 
tion, and the way in which the theorems are> formulated for appli- 
cation of the resolution principle. 

Theorem 2. In any associative system which- has left and right 
solutionis ai^d t for all equations, s ' x == y^knd x • t = y, there is a 
right identity element. 

Pjoof. Application of the resolution principle to produce a proof 
for this theorem is outlin^) in Table 3. The .notes given beiow 
elaborate on that proof: / 

Al: There exists a left solution. This moans that for/all x and y 

there is an s such that s • x^ = y. In other words, there exists 

a function g(x, y) = s such that g(x (j y) • x — 
A2: There exists a right solution. This means that for all x and y 

there exists t such that x • t = y. That is, there is a function 

h(x, y) as t such that x : h(x, y) = y. 

Table 3 
Proof of Theorem 2.,/ 



Clause 

name Clause Reason 



Al g{xj}x-y - . ' \ Given (existence of a left solution) 

A2 x -Hx.y) - y " Given (existence of a right solution) 

A3' (* y - u) Siiyx - r) & Given (part of associativity) 

•■ (x'O - w)-»(u'r • h-) i '* 

A4 k(x) x tk(x) /Dcnia] of conclusion* 

A5 (x y-u)SL{y / - >)-(ux - u) \flA3.a.c] 

A6 Myz-y)-+luz- «) r[Al.ASa] 

A7 yzVk^ - r(A4.A6bl 

r A8'"- Contradiction * r(A2,A7J 
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; Establishing associativity. Iji *tMs £ jiil^ 
'■■vfM x (y • z). Actually; we neejd fciidij^ 

; x...: (yv: •?) * fedK^^'^ ; • V,,,;, 



m 



Hence; 



^ < (x . ,y ,u) & (y • ^'v),<& (x j^4= w)^ (y z = 

A4 Denial $frconciusio^ 

there is iib ri^ht idehtity eleh 
v ^entity element ^ 

^"u. -^i other words, : ; tWe\^M 
*v ^hat k(x), : x is not-equal ibj^ ^vl 

^ In thS proof givwiri Table 3, clause>A5« 
*f>y "factoring" ' A3 with respect its! fijfc 
; y clause A3 implies its special, case AS^andj^ 
v a f$£tor of A3, JTWs Jfcctor . is pbt ainfed i]by 
aiid^hird atb;ns bf A3, namely,. x • y * t <\^^ 
t u t^d 63 r , v,' and u repl a^es w thrbyghoujj^ 



has become idehtical to t the fast ^and therefwp .'rfeduri^iM6^p5|* 
' atiori£ we see thaTit, g&^nei(al; fa^&HrtSM^ 



-celed. 'From these opferatiorip 
clau^'. witffrr&Dect to '. tfa&t more d|y\yp^ 



*kfp caitionj 6f Tabie 4 : lnUe'$mer^^ 

4 * j/ logically aieqoivalejii fo . the „ ii^^^bjni^nptaUo^ iiQ^^'^p 

,./ y • programs embodying his Ves|)lutic 
|v ••*••:.•; >;'di^\esilisjunctiop.)^ : V'^v :'••*' i — 

ToMllustrate.fheidea of minimal substitution, consider tfit pro- 
V * «ess?i;6f fa#ofi^^ wyth respect tp its first twP 

W 'M^f^-^M^I^^l v. P(y^ g (x)) v, p(w, f(y)). -'; . > 
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\Ctquse 
.mime 



Implication not of tort . 



Dfrjuncjke hotot ion- 



Reason 



: ';;A1.. 
' A2 
> A3 

A6 
A7 
A8 



Piii**;*) -/ , U,r.w)V/»(w 1 r # ^) 
rPW*W(»» ' -«*<x)*atf (fir)) ; 

Contradiction Contradiction 



Given (existence ofa. . 

left solution) 
Given (existence ofa 
... /.right soTxii jon) 
Given (part ofaj- - 

'Relativity) 
rjenijat of conclusion 
/[A3,a,cl /• > 

r[AI,A5a] 
r(A4 f A6b] 
r(A2,A71 " 



We try simply to match the two atoms from left to right. The first 
required substitution if f(wj for y. (Of course, the r^versej i.e., substi- 
tution ^of y for fl[w), would.; not be valid.) Substituting fl(w) for y 
throughout the clause yields ' ' . ' v 

' P(f(w), w) V P(f(w), g(x)) V P(w, f(f(w))) 

The next required siibstittitioh is one'in which ^x^feplaces w /Note - 
that the nonmihimai substitution of h(z) for x and g(h(z)) for w^lso 
leads to a match, but. the, "factor* tfms obtained wouliibe ^ spwial, 
case of (and, therefore, worsje than) the factor jtfet ^e shall .9$^ 
/ this way. Substituting g(x) Ww yields * y / -v. ' . 

; V : ' ; . P(fl[g(x)), g(x)) V P(^x^ ,v ■ v ; 

' Thus, the Substitution of g(x) for w and>stnall fl[g(x)> for y in the 
original clause is minimal. The faitor/ obtained by deletihg^one r of 
the Mb redundant atoms; is : : ,^ . ■"' *. •'' -••"" ';/" 

If the above-mentioned nonminim^l substitution W^re used^ the 
I "factor" wpuld be as indicated Above, except that,h(z) would; 66 
substituted for x throughouVthe^ctor. It .havens that . there are 
no other jfectprs of the of igin^ third atom cannot 

-bp matched? with, qkher the first or secbn^ atoms. <n v ' ^ , ■ 



■2^':;;; : 
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V ; Example 3; This example 

trates the minimal-substit^^ aspect of resolution, The list of 
arguments for'Bl:ahd'B!2cah%e;iria'(ie to match with.^each other by 
implementing minimal substitutions. Accordingly, ea^fclist of argu- 
ments becomes: • ' * : 

••• ' ' / - •• . ' s, ■ • . ^\ . ; " • ■• • / ...... .. 

. V '. • ^ g(s), * ' .... .: 

m(g(s)), * ■•: .. ' 

, h(s, m(g(s))), 

. ;n(g(8); h(s, m(g(s)))), / ^ ; 4 • 
. : ;k n(g(s), h(s, m(g(s))))). I , 



4.2. Definition of the Resolution Principle. A more precise defini- 
tion of resolution and factoring now can be considered Using 
^disjunctive notation,, the resolution principle combines the follow- 
ing two basic ideas: -v • 



1 . The syllogism principle of propositional calculus, i.e., from 



'and;/; :v. : -v .. 
oiid may infer 



• . a V M' , 
<* -a V' c 



2. The instantiation prlri&pie of predicate calculus : from the for- 
mula . •. v . ' " ; : ; v.v'. . 

• ' y • ;. ,< ^">> •. : ;\^F(^ v 2 , ..ifvj. • ' ^ • '' :'_>;'•;'.; 

. . • •• V . TABLE 5 /.. /• 

% .... illustration! of the minirrial-substitutibn aspect of .resolution. 



Clause 
name 



Clause 



Reason 



Bl 

*B2 
B3 



■ Contradictory ^. 



Given 

Denial of conclusion 
VfBI.B2] " 
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Avhich ts understood , to ;1- V 

infer thejormula • )' : ; /f ''" : / < \ :. ■ 

, • ; .'/:;•■ ! i? ;;^^-- : ' tn) :'. "C'^-i ' , . ' , - 

Obtained by subsdtutmg ^^|?rms" t t , t 2 , > t n for the variables "J; 
Vi : y 2 y -V-Vy^es^^ a term" is either: -#r 

(a) an ip|iyidual cprista^ . v/,- ' 

> (b) an;indiyidual ,varial?le ^ . 

(c) a func& g(x, v) and h(s, m(g, (s))). , 

•' .• .- • * : ..••)•': '. * ••<-:• ' . • . ,'-.y 

General resolution: i^jiari >f Jfie. bv^aH resolution principle: ^ / ^ 
disjunctive clause consists;^ ■ 
tion; a literal Js^an atbrrf^f tjffe negation (denial) of an atom; The >..,;, 
resolvent, if atfty'of a li%$l in one clause and a literal in another .. ^ 
d^se is im$jf f; j?yj tjke^t^q xjausjss tak^h together. Such a jesolr v : V£ 
vdnt is obtained as fpllojvs ,i A > a s 




, F The^s&i^hfe^ of * e literals remaining 

• ; • in the ^t* ; art^|(tf o^ v. . / .. * • . 

Genial 

"■tit^ by that • 

. clause and ig 0(3 tu^i^J^^s':" ; _ . K- 1 ' r^'C . ■ \ ■ . : ' 
; '/A.. Find, t ^^^^^^|jj^> s ^Jt^|ic>n , if any, whiclvmakes , 

'■ siitjstitutioi^ throughout the clause.' ^ 

- C^cel&ll^ 

■ V. ti^i . *■,/ 

v ■': "■ £ v : ' : 'h:?v y 1 , 0^1 -s * ■ ' ' 
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^ ^D/ The factbr, tHen,.is the disjunctio tf^^l^HB faininjz literals, 

4 J; A. Robinson (1965) proved thajy^fflBBl^ 

r- effective sound, and complete for .proo^iT^jung. Subsequently, R. 

ILee (1967) showed that resolution is complete for consequence find- ' 
V-ing^as well; Its effectiveness means that one can write a computed 

5 program, which, iri a fjnite number p£ steps, will ifind the factors of 
any . clause- and the resolvents of any two clauses. The principle's 
.soundness means th^t a clause logically implies each of its factoid 

andthat twb.«lauges, taken ^together, logically' imply each of their 
resol^^. : r / *:\ ':'* : ":. : \ : y .;V : -: . 

.'. : /;; ^<> ' ,' -.\- .', ' . - . 

I Theorem. If a finite set of clauses is unsatisfiable, then a finite 
number Qf applications of the resolution principle will find a con- / 
tradiction. ■ • v 

Theorem. If. a -clause G tea consequence 1 of a finite tfonenfrpty set 
of clauses, then-a . finite number of applicatibps of the resolution 
principle will find -atfclause T Such "that C is an immediate conse- 
quence of T alone. ^ ; !• " - 

Several researchers ^have strengthened these theorems by show- 
f ing that certain restricted fpnus of the resoliition: principle ar^still \ 
complete. This is of practical importance to automaric theorern^ 
proving because research results have indicated that r|stritted T (yet 
stiil 'compliete) resolution, tends to' be more- efpcieiU^ thaii ^unre- 
stricted resolution. ■ v '' : 'r- 



5.. OTHER PROGRAMS FOR MATHEMATICS ; 



I 



In dddffu^ in calculus and 

^geom^i^^^ of heuristic - programs 

! %r;Aa&^ include a successful regression 

analysis ^gram^ 

plicated syitil^ expressions. These are excellent ve- 

hicles for trying out Wtificiai intelligence algorisms' and techniques 
since the ferns' are\ well l^rmulated. In- "addition, many re- 
searchers°a|eflnterestednlit eventually pbtiaijaiiig a practical program 
suitable for "developing desired sdfetibns to Realistic m^^matical . 
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problems. The additional examples outlined in this section are rho- 
tiWted by both of these major forces. ■ ' 

; 5.1. A Programmed Aid for Manipulating Mathematical Ex-- 
pressions. Several ^searchers have written programs to help people 
manipulate mathematical expressions. WilliaSTMartin (1967) and 
Joel Moses (1967) have written .one such program for the time- 
sharcd'eomputer system at MIT. Among other things, the program 
can simplify, differentiate, and integrate mathematical expressions, 
and it' cart solve some simple differential equations; In a typical 
episode-: the 1 human user Submits some Jmtial Mathematical ex- 
pressions, along with directions for their manipulation. In response, 
the program produces intermediate expressions resulting from the 
classified operations. Then, the user d escribe s the way in which he 
warits those intermediate expressions manipulated. This interactive 
dtogue may go through an arbitrary number of cycles until the 
;.;ble^obt3ins a Set of output expressions that he considers to be in 
• final form. In this way, for example, the program c#n be used to 
simplify and differentiate very large expressions qyicHly and accu- 
rately. Its operating ^abilities Remove it from tto^jhss of being 
merely a replacement^fpriwork that had been ||||Wn^nually ; 
rather, it pin- handle routinely ^ 
plexify place them beyond contemplation for manu^ 

As such programs aref augmented with improved al|#i(h|^ifc; 
•suiting from more r^feht research, they become 'TJ^f^j^ 
assistants, thereby enhancing their usefulness. F6||^ ? 
pose s^pieone wants to, write a heuristic program^|| 
' some specified task. Instead of trying to 4p %is direj*^ 
,'- poss iblegtp describe the task formally to a -"taskiijjL ,,. 
'ant," wtiicK,:' in re<k°ns&$j|^ 

required task dir'e^i6^^P»tfeBabilities still are limited, 

they certainly are.;*$l$i||^ 
"sequences of det ail^^ 

Already are in usj^^pfh^^|l^fM^lik^ : lariguages;:;are.., f 
being developed to1fls$|j&^ , 

v 5.2i A Heuristic Reg^ floyd • Miller 

(1967)" wrote a practical heuristic regression analysis program that 
learfie&i Input tdthe program consists of a set of (n + l)-tuple's (i.e., _ 

1 ■l^^^^--r^M' 
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, Xi, x 2 , ,/;^X^^); Where n is no greater than 40. The independent 
. variables xj ar/the predictors, and the dependent variable y isihe* 
/ response; The othej f input item is k, which is either 3, 4, or 5 and 
subject to the consent that it does^not exceed n — 2. The pro- 
gram computes the model assdciat^Bith each of the 2n distinct ;■■ 
sets of k predictors. By d e fi n i t io n ^h'c Jmode 1 associated with the k > 
predictors, (i.e., x, p x, 2 , ... |Xi J i^ 

which provided the "best .fit" foj the sets of (n 4- l)-tuples. f he aj, of 
course, represents the multiple regression pq^filcients,* ©utput gen- 
. erafed by. the program cpnsists of the three best models th^pri)- .1 
. duced. The user often isjrftcrested primarily in the three good scts 
ofk predictors which he obtains as part of the models. Once a set 
,;,']!': of k predictors is chosen, the program^ applies standard least-** 
f&'-- squares procedures to compute the multiple Tegression coefficients 
aj. This latter aspect js of secondary interest since it does not 
depart seriously from traditional^ methodology. Of greater rele- 
vance in our present context, however, Js the procedure whereby 
| J V , the heujistic regression system learns'to select good predictors: 

J A. Initialization of the run includes input of the observed data. 
J" t r, •. (i.e., thb set of (n + l)-tuples). Assenting no prior knowledge 
about the relative t^bfulness of e^ch Qf the ri predictors, the 
• V : A i: initialization' process assigns equal pvobabjihksfoi usage. 
.!> J Thus, each Pj == I/nowhere. Pj ia the- probability of choosing 
. •■ 7 ,:,;^ $\ ■ ■ . predictor Xj in step? C/ . - ■ : ;*;V: v .* ; v f^'w^y. , ■• 
■■0^0 B. Initialize the trial processing^. each frialy a setof 1c.;* 
*\.; .-. predictors is chosen. .; . : .*; 

C.-Choose a "predictor in accofd^ii(^i^|th ..ftie rfelativ.e prijjfea 1 " 
..bill ties Pj V .- ,. ^V;.-- • ; ><l '" ■> "V: v ' 

. D. if the selected,: predictor lalready*f^^!!peen chosen during f his ' ; 
y trial, discard^Kis choice ^^b* tcugg^ G/.'. .^V^^.^ 1 ;;':.:-^ . 
/ , E. If a set of k predictors hais fi^ yet^B^ ' 
F. If the current set of k predictors is the same a s t fi^ s e tiu secT i n' ' 
some previous trial, discard the set and, go to C. v - '< 
. G'. Use regression analysis to find^the model associated with the 
. .. [$' set ofk predictors. '% ; . 

H^Savdthis model if it is one of^he^thj;ee best modelsi^SMbd 
- . ^^so fat/during the run; , . . . ."KV- ^ 

-/Vv .: : - J • . ■ V-;.': • '..l^-^te^ 

^■:'y* j : ' ^Si . r:..V 
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I, Reward pnd punish each of the predictors, That is, adjust the 
probability Pj of chbosing predictor Xj i (Space prohibits de- 
tailed discussion of these techniques.) 

J. If fewer than two n models have been generated; gcf to step 
B ; otherwise, print the three best models and stop. 

The performance of this type of program is excellent. Since its 
inception in 1964, it has solved well over a. thousand different 
practical problems at various industrial companies. (Examples arc 
described in articles by Blackmore et al. (1966) and Drattell ^^p 
Applications have ranged from production planning in the {J&int 
industry to the analysis of traffic fatalities an Florida. v 

s S3. Implications of Mathematical Problem-Solvers. Progrc& in 
this area of artificial intelligence is sufficiently striking ^tcy^cb^the. 
efficacy of mathematical problem-solvers well beyj^^ 
these systems produce useful solutions to probleni^bm^ 
teliectiially nontrivial in human terms, there ard: grounds for a 
strong argument that such programs meet most t raaitional ;criteriar J 
for intelligent behavior. - . - 

6. A PROGRAM THAT FlNDS^pHEMICAL -STRUCTURES . ^ 

.Heuristic 'Dendral is a program designed by Edward Feigen- 
:vvbaum;( ; 1968) and-qthers at Stanford University. Input consists of 
L . "fe mass spectrum of some acyclic or- 

ganic molecule. In response, the program produces a^^9&^|p^^!'' 
tural formulas {i.e., molecular. graphs) that explain the gi^n; innput 
; in the light of the program's model of toe mas^ and 
stability of organic molecules. The list is ordered-s.tartirig With tbg^ 
most saiisfactory explanation. Comparisons always^are inj^estir^ 
* In thisffisbnee, for certain classes; of organic molgsirf^the pro- 
/gram's performance (i.e., its spwd .and accuracy) approaches or 
■ exceeds that of postdoctoral laboratory workers in mass spec- 
trometry. * t . 
^ Heuristjc Dendral consists of four basic' processes: the prelimi-, 

jf 

tioji heuristic rules: It makes .at ^reliminary 
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tcrprctiition of the data in terms of the i^t^cpcc of key functional ; 

groups, absence of cAher indicator groups, Weights of radicals at-_j 

tabbed to key functional groups, and so on. 

These activities pave the way of the hypothesis generator, a com-; 
poncnt that "knows" the valences of atoms arid capable of gener* 
* ating. all of the- topologically possible isomers lor the empirical 
formula submitted as input. In a very real |£nsc, then, the 
hypothesis generator and the empirical formula determine an im- 
plicit tree. At the top node, we havc^ll the atomsJbut no structure. 
^ At the terminal nodes, then, there arc complete structures but no 
unallocated atoms. The search within this tree is guided by various 
heuristic rules and chemical models such as the output from the 
preliminary inference maker, the a priori model, and tij,e zero-order 
theory of mass spectrometry. The a prior^ model is a'^odcl of the 
^ chemical stability of organic molecules basdB bnMjiip; presence of 
. certain denied and preferred subgraphs, of the chemical graphs. 
Zero-order theory is a rclativSJy crude but rather efficient expla- 
nation of the behavior of molecules in mass spectrometry. It 
screens put entire classes of structures because they are not valid 
with respect^to the data, even within the latitude tolerated by a 
crude approximation. Output, from the preliminary inference maker 
^/ and hypothesis generator consists of a list of molecular structures' 
that our candidate hypothesizes for explaining the empirical spec- 
trurn. This generation process embodies a rather complex^theory of 
. ■ vi /mag5. spectrometry. The eyaluator is . a heuristic algorithm whiplr^ 
/ '\;'"^^^hesL'thtf predicted spectrum for each candidate with the empiri- 
# ^ previously. After discarding some candi- 

, dates, U^^piaj^lhg remainder as mentioned earlier. More recent 
V enhanccmcrrts ; to this program include nuclear magnetic resonance 
data-as part of the criteria for identifying candidates arid evaluating 
. .their suitability. 

1. DATABASE MANAGEMENT SYSTEMS AND ARTIFICIAL INTELLIGENCE 

* i v A typical database. management system consists of one or more 

large computer programs that can make additions, deletions, re- ' 
V triey$s^V$nd modifications in a large collection". of data. Such a 
; ; T systcrtj^iin answer a qbestion submitted to it * by displaying facts 
■;;extr^t^ data-cpT^ectibn. More sophistic^^^stems can 

- : ^ l 2tei; -;'v :: A^' V - . . ' . : : : ' 
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g^^W^WMlye^i^ liken such searches to foqWw u, ■ 
a rather dull-witted; lazy, anVcompletely litcral-minded VeKrehce 
(librarian. ' " : - >/'•'• .*:••»•.'•.' ', ' ,. : 

i One of Ihe most active areas^in cdnS^uter science concerns itself 
with the development of concept and methodologies' 'that will im- 
prove the quality and responsiveness ofsearches on database^sys- 
..tcms. A considerable jpart of that activity is relevant herebecaufe it 
seeks to apply artificiaUntelligence, techniques to the spe'tffication 
analysis, and processing of search requests. Anthropomorphizing 
again, the objective is to make the system more like an industrious 
librarian who exercises some initiative to help the user. One .aspect 
of this work seeks^jeplace the highly strutMred categor^ sensi- 
tive search requests :with a more naturatsyntax. For example, Wil- 
liam Woods (1978) at ^lt, Berane^nd^marf ; wrote a p^ram ' 
called the r Lu.nat SYltem, a- natural language system for retrieving , 
^formation from a database of chemical analyses of moon rock 
Samples The interface' |E^ tKe user ahd the system is equipped 
^9 handle inquiries such as^'How many samples contain more than 
-lO^rcent iron?" Improvement in ^ sucNsystems often ^manifested 
in their ability to . accept .less and less^jrecise inquiries. ^David J 
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University of Illinois, sue- 
'" ■Arc there any common 
month?" Of course, the 
production of trivial or 
I metallic," or . "All the 



fig to make;:nf\prevahd. 
jresporises to -search ; rc-; 
Ims; are moving qut^f- 
tronnient. For example, 
Istems whose users can 



/t Language 

ine instance in a wide £ 
to communicate with 
hile there are many 
|ttempt to approach 
c that imitates a very 
[st majority of man- 
npromise on the part 
ig problem continues 
[and ambiguity. Any * 
l^ith it an implied 
Ifcd by other people. 
Iputers, the difficulty 
grterpcetive capabili- 

_ brief mention of some 
pis of efforts to improve 
!man on his ,Qwn terms. 

■'^^^ch'Wdw^^^^^ idea*f being able to 
sf^Ta computer is tantalizing. Speech -j, man s^osynatural 
communication system. People can speak faster 
tor write; they can sjpeak,. when thjv are ..M^gS" 
their hands are busy, Consequ^y, « r ^^f""J^ 
would malce a tremendously convenient terminal to a comber 

that understands speech. 
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Under the sponsorship . of the Advanced Research Projects 
Agency (AR PA), Alan. Ncwdl M uU 1973)iqrmulatctf a:!!^^^ 
plan whose objective was to produce a prototype system capable of • 
undcr!5ta1\(iing ;-8'pccchV ,: ^Uhin H19 context of a limited domaiiMfty 
discourse, the system would be utile .to understand an American 
uttering ah ordinary (though somewhat simplu) sentence construe;-, 
ted from a one-thousand-word vocabulary, the speaker, would 
enunciate in a natural manner, with no extreme regionaf dialect. 
Two of (lie live original participants (Carnegic-Mcllbn University 
and Bolt, Boranck and [ Newman) still. arc in active pursuit of these 
objectives. v; , , v • 

These prototype systems base their processing on acoustical , 
clues* syntactic context, and semantic cofttcxlM microphone trans- 
forms' the spoken input sentence iijtb-a ^waveform of amplitude 
versus time. This, in turn, is converted into -a spectrum of frc? 
"quchcics versus time and the amount of energy in a given frequency „ 
band ultimately is represented by a gray lcvel,\Embcdded in this 
spectrum arc bands called formants. that can be tracked through . 
the signal. These formants^corrcspond to vocal track resonances 
. whose trajectories represent movements of the ydcal tra.clC ; C(5r.tain 
Vowels and diphthongs often .can be detected by mean^of. the 
formats. Unfortunately, a formant's shape, also is Subject to the 
influence of vowel context, i.e., the sounds that precede and follow 
the vowel. Consequently, the determination of "Spoken phonemes 
(elemental speech components) is highly contextrsensitiye. Because 
of this, these speech understanding systems take advantage, of the 
small size of their vocabularies and use context heavily in identi- 
fying phonemes. ■■• ; 

In fact, this is not fundamentally different from the 'type;of pro- 
cessing pcrformpd in human speech' recognition 1 . On isolated . php- 
nemes, experienced speech scientists choose the correct ones less 
than 75 percent of the time. However, when given entire sentences, 
their accuracy exceeds 99 percent! flft. ~: 

■ . , ' : . ';' ; ; . " ■;■ *■ ■ ' ; • ■ ■ 

8.2. Naliiy Language Processing 7 Systems. Speech recqgnizing 
systems kre^lralleled by a much j larger. effort alpied at developing 
processes that enable a computer to understand natural language 
j, in written form. Roger Schank et air (1 9^5) constructed a system 
'^fed MARGIE (Memory Analysis ^Response Generation and In- 
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fcrcricie^n English), The system consists 'of a set of three pr^i^fe:^ 

(1) A conceptual analyzer accepts sentences in restricted .flAV' * 
and* maps them into fconccptual rcpr«chtHion : of their i^ianf 

(2) The memory and inference program stores new inpUtff 
establishes reference, and makds inferences about the input J 
received. (3) Finally, tlyc generator program encodes conceptual 

. representations into symircti^ structures which arc then, mapped 
into English sentencesr V 

MARGIE opcratcs'in two mtfdcs; paraphrase and inference, In 
paraphrase mode, sentences arc analyzed into* a representation of 
their meanings, and then those meanings are written in as many 

• ways as are available to the generator for expressing them in En- 
glish. Since this, internal representation of meaning uses constructs 
that are not part of natural English, the problem, basically, is one 
of coding ideas into language. In inference "mode, the 'input sen-' ; 1 
tenccs arc analyzed into their,, meaning representation and then 

» processed to find out what other information might be true or 
might arise as a consequence of the input. Inferences are used to 
motivate further inferences. These, in turn, are transformed into ' 
English and printed for display. 

The processing, is rather massive, so that the response is slow in 
human terms. Forexample, when the system Executes on a million* 
instructibn-per-second machine at Stanford's Artificial Intelligence 
Laboratory/the response time in paraphrase mode is a few se- 
conds; when operating in inference mode, the response is measured 
in minutes. 

We can develop an inkling of the system's operating level by 
looking at a processing example in each mode. In these illuStra- ' 
tions, the hiiman communicanLwas familiar with the contents and 
extent of the program's vocabulary. 

^. 4> Inference Mode 

^^nput : John told Mary that Bill wants k book. 

Output 1: ^ book about what? 
3r.' Output 2: Mary knows that Bill wants a book. ..." . 
% Output 3: Bill wants to come to have a bSbk. 

Output 4: .Bill wants someone to cease to have a book. 
Output 5: Bill wants.tp read a book. ^ 

■" -.1*.-.*- t&> * ■*■ ■'* ' ,v ; 
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Input: John gave Mary a p aspirin. 1 

7 Output i^jlohn believes that Mjuy wants ajj aspirin. ; 

• Output2: Mary is sick. *> 
Output 3: Mary wants to fecj better. 

Output ^4: Mary will ingest the aspirin/ \ v f T S /j, 

* .Paraphrase Mode *" : ' 

} ^ ;WpUt;U John advised Mary to drink the wine/. 

^Uutputl 'john told Mary that drinking the wine- would benefit 

•v-'- " \ her/ • • - , ., / ■ 1 ' ■ '\v 

, ;. --^4^^^- ' ""'T***'^* 
^ Input : John prevented BiH from giving ^pma to J^Iary by 

V : ; selling a banana to Rita. ^ v ■ > 

Output 1: JJill was unable to give a 'banana to Mary because 

' Rita traded John some ^nohey for a banana.; 
Output 2: Because Rita,. bought a banana from John, Mary 
~ could not gcta banana from Bill. 'v' V A 

MARGIE is; of particular significance because it represents a 
fundamental departure from earlier natural Janglfage processing 
systems /which were bated on a sjtojed inventory 'of-wtirefs and 
grammatical t rules. The basis of this system is conceptual^ 
dependency theory. Iff essence, this theory contends that naturjpt 
language has an underlyihg meaning structure which should be 

■^ed for all pertinent processing. The claim is that people thipk by 
using this meaning structure which is independent ofva language's 
specific words. Schank^yl. have proposed such a structure and 

, used it in MARGI&for representing the output of a meaning.ana- 
lyzer. It also serves as the basis qftinferentoand memory programs 
The structure requires thaVarty^wo sentiences having the same 
meatiing should have only one representation^ An important com- 
ponent of such a structure is a set of primitive semantic elements 
into which ^votfds with complicated meanings cao be mapped. At 
this fundamental level of meaning, concept's combine into concep- ' 
tualizations. A conceptualization, is a statement about an actor 
performing an act; an actor, in this context, is either an animate 
object or a neural force. Correspondingly, a physical act is per- * 
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; ' : ;-:V-v ■ ' ' ' ' . ;V* : v'v''- ■ ' 

^ v formed » on a physical object while a mental act is performed -*pn^ 
merital Object, i.e., a conceptualization. In this context^ for example, 
: , V.^url^'-JsV'not'j<f;bnsi^rl^ an act. Rather, it makes reference to an 

unknown act that, results in a "hurt" stafe for the object acted/ 
■< >upon. By the^lime reasoning, •"provept"«is not am act either. In/ 
: stead Jt is relational between two act^ / 
This formulation of the Mature* oPm^Miing structures allows the 
• Creation bf a set of basic -primitive -'acts, each of which is defined by 
the inferences that are true when it is present. A given Verb is 
represented by a ccWjMnation of primitive acts. The current system- 
y uses, eleven; 9 such', a move, grasp, expel), '^n other 

primitive act is "atrans" ("give* is an instance of atrans)/ T^is 
primitive act^ requires an actor, an object, and recfejont cortsi§ting 
of a source and a. goal. The source and gem) iriust fre animant and 
the object must be physical. « V** . J / "\ - 

; g Even from this superficial overview of the basic ideas, it is clear 
that MARGIE is an attempt to simulate the processes involved in 
• understanding natural language in a nbnartificiaLr(i.e., concept- 
based) envirorimei}t. Here again, as we have seen' in^ the case of 
garrie-playing and theorem-proving programs; the>Vehicle itself pro- 
vides a stimulus for further study about the way we interpret and 
process ideas. / ° , ■ : * f 



.9. CONCLUDING REMARKS 



- Based on earlier discussions of a number of heurisdc programs, 
we now,, are in a 'position to examine such programs more ab- 
stractly and ask'general questions-about-f them: Into what catego-^ 
ries or aspects (Jan heuristic programming be divide^? What are 
^the future applications of heuristic s pf b^ramming*? What are the 
philosophicalfand social implicatioi^ of fhe o advent of intelligent 
machines? / 

/■ ' / * > V 

9.1. Aspects of the Heuristic Programming Problem. The basic 

issues of "heuristic programmifig/an. be categorized conveniently in 

terms ofthe following six aspects: # ."' ■ a • • ; ; 

(1) generality "f / — 



(2) searching , ' / 

(3) /functions that make evaluations and recognize patterns 
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■". * 7 . ■ /. ■'.*'••■ • ; 4 . ■• . ■ 

- (4)* the matching of data Structures t^o determine appropriate 
V ' substitutions forv variables -i ' j £ ". " ' 

(5) learning i' - . ■ ' / , 

,(6) planning r , ' W 

A heuristic program is *sijd to ( b£ multipurpose (i.e„ general or v 
broad) if it can solve a wid^Yarietjncjf' problems or answer a wide 
variety dp questions, Since tt^e intelligence ' of 'a huVnan is multi- 
purpose, there is motivation in the-artificiial. intelligtence community 

p toward getting heuristic programs to be multipuf^se, eve^vif this, 
means sacrificing the solution of some difficult problems. The hope 
is j that once multipurpose programs can be written to solve 
relatively simple problems, these^programs v/ill be extendable to 
more difficult ones. There is serious speculation that such programs : 
ultimately^will- be equipped fo solve their j)jvrrprpb^ 
pie, a' program that' needs .to pfirfdrai a search will include an 
operational component that will define how the searchf' will be 

"conducted! * * '['".' • * , v *' - ; . • 

• The idea of heuristic searching ii applicable to problems that 
cannotvbe solved directly. tinder such circumstances, there are. in- 
stances (51agle^ v 49 ; 70) in which programs have been written to 
search for a solution. If the-number of possibilities to be searched is 
sufficiently ^small, the problem is trivial since ths program c^i con- 
sider all possibilities. For an intellectually MiffiBult problem, -liow- 

" ever,- the number of jpossibilitie^ is sufficiently large so > that an 
Exhaustive procedures not feasible. In most kiijds Of theorem 
proving, including, those of predicate calculus, the number of pos- 

.sibilitjjs to be searched is potentially- infinite. Consequently, it is 
much better ^"consider alternative" ways\f defiriing'and modifying 
the search. In fact it often is, desirable (aqd even necessary) t& 
replace a searctr procedure! guaranteed tg work in principle with an 

- alternative procedure that is nofauaranteed but is good in practice. 
<f This happens/foi* Sample, w heir a search prgcedure examines only 

the top two levels of a. game tree rather than.jhe complete one. 
After being- modified in this way, a-, search Can sometimes bS re- 
placed by a more efficient equivalent search. Thus, a depjh-first 
minimax search may be replaced by an alpha-beta search. Wher- 
ever they can be identified, it is desirabte to search the most prom- 
' isingpossibilities first, thereby gJlowin|:alpl^*eta cutoffs ta occur. 
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In general, then, it is sa/e % -to predict tfiat improvements in hatd-. 

* ware perfo^ they may be, will not bring exhaus- 
tive searches tfnd other b/ute-force approaches into the .realm of 
practicality. Consequent!};, thasix aspects of heuristic programriting 
listed above '^ill pJiay significant roles in4he formication and im- 
provement, qf methods for reducing the number of possibilities, to 
be examined in the pursuit of a desirable solution. * ?■ ; 

9.2. Future Applications df Heuristic Programming. The iifipact 
06 heuristic programming is growing rapidly. Pertinent algorithms 
Of\d techniques h^ve been defined to a sufficient extent so that there 
already is a trend away frpm game playing an4.the solution of, toy 
problems toward the solution df , problems hiding jeal economic 
hpd social .value. 'Many soTutibrn methods, which were technically 
feasible but not cost effective, in,the^past now find everyday use*in 
business, industry, and .'-government because of the rapid decline in 
computing costs. . : ' >•'•/. ;j \\ ■ ~ \- ■ 

■ Robots already are a reality,-and many ^ of thejm afp driven by 
heuristic programs. Ong^nally designed fof use in hostile tfnviron-> 
ments (e.g., n6 .oxygen* high radioactivity, or inhospitable tempeiv 
atures and pressures), improved heanstics have helped make thenvf 
Profitable, in a wider vari'ety of contexts. Consequeiftly, such Tobots 
are,,i)erfortTiing complex assemblies ahdjinstallations in the aircraft , 
and automotive industries. ' »\ 

• 'Continuing nrirtjaturization of computers has made it possible to 
sever the physical connection betweeifrrpiipt and computer,' so that > 
pro^cts oIjl mobile robot* (with th^ computer as an intrinsic 
coniponttit) are realistic. For exampje^ Jbhn ( McCarthy and his 
Colleagues at Stanford University are t doing? research, oh a self- 
guided cart^ designed (ultimately) ' to^ trtavel unaided on existing 
roads. The" Beast, designed by George C^ljon, John <g. Chubbiick, 
and others ^at the Applied Physics Laboratory of Joljns Hopkins V 
University; is equi^pped^with hardware logic; and stqeriflg, and it has \ 

t tactile, so naiya,nd optical* apparatus as weUi This combination of 
CacilitieMQptivated by hfeuristid programa/enables the Beast* to find 

- its way down the ctyrtfec of a T Jiall; Wtlgn its battery -becomes suf- 
ficiently run-down, it**looks"^6r (i.e., optically locates) an flectric 
outlet and^lugs itself in to recharge its toattery. Tte-hejiristics for 
locating suc1h v outiet^ have been developed to a sufficient extent so* 
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that the system Was able to "surviver in a building of halls and 

• offices for periods exceeding 40 hours before it "sWVed" as the 
result of its inability to find another outlet. i 

An assembly robot developed by Ambler et al, (1975) offers some 
interesting insights regarding future developments in robotics. This .* 
particular system* using connected, overhead television cameras, is ; 
able to assemble objects such as. toy cars and boats using part 
descriptions developed from <the input signals supplied by the f 
cameras. Production robots start with such predefined descriptions 
as part of their "knowledge." With part descriptions^ thus devel- . 
oped, in conjunction'with predefined assembly instructions, this 
-robot is able'to.sfelect ancl assemble the appropriate pqrts from a • 
larger heap containing a mixture of required and unnecessary 
parts.-' '. ' '.• v '-. • 

An area of artiheial intelligence in which research Results have v 
been disafjpointinjg is that of automatic language translation. Thus , 
far, the idiomatic nature of natural languages and' their great sensi- 
tivity to geographical, historical, and other cultural contexts have 
been serious obstacles to the design and implementation of suc- 
cessful (accurate "Sft'd natural-sounding) general translators. How- 
ever, this work has' produced innumerable valuable insights with 
regard to the Underlying structure of natural language, and these 
discoveries haye been instrumental in the development of highly 
improved programming languages and language processors. Man^ ^- 
of the heuristics that drive today's natural language database query $ 
systems stem from the work in automatic language translation. " 
Whife the opinion is not universally'shared. many computer scien- 
tists working in artificial intelligence are convinced that continuing 
progress in heuristic programming and in our understanding of 
meaning will lead eventually to an effective natural language trans-* 
latirr^ system. s . 

• Earlier in jthis article, mention was made of the power of heuris- 
tic programs as evaluative vehicles. Instead of merely, embodying 
the operational characteristics of a particular model, iuch heuristic 
systems.inqlude a description of the model itself. Consequently, it is 
P9Ssible Jo simulate a particular system much more realistically; 
moreover, much more realistic systems c&n be^ simulated. Accord^ 
ingly, it will be much more helpful tp' use &uch systems as aid^to 
defining plans and strategies in a widening variety of endeavo/s. In 0 
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" addition, th^se systems already are proving to be effective; teaching 
" ■y^hki^ One of tlieireasons for this e/IectiVeness lies in tfe fait that 
Jtjie student • can v participate in the model as player or problem 
solver/ .By 'learning the^&pntent of the model, the student gains a 
deeper understanding of the fiompjex system- under study/ • 

"'.'^V- ,. 1 7 y*'^'!.',\ 1/ / v'v." "V ' : ■ . 
V-.'- 9.3. . Implications of Intelligent Machines. The prospect of really 
.' intelligent rr>achihes.a|fead^ hai tremendous philosophical *jnd 
; social*implications. In philosophy, the presence, of sudh machines 
will shed light on mephanis'm, the perennial "mind-body problem" 

• ^n& .perhaps even the role^ o/ man in the universe. Ih^tself, the 
•'■ ^isten^ji: .of? 4ntCliligent : '^hachines .would bolster the claihns of me- 

' chanist^hat man is nothing but a machine and that the answer to ^ 
s^;' the mInd^body ? p.fQl?ienjJs that there is only a body and nothiYig 
$ tha^can be called of develpping* 

1 V intelligent" qiachines, certain jmtrinsicjiifferences 'between mah and 
machines rAay-or may not reveal theftisclves, and this will consti- 
tute evidence for 'or against Mechanism, Many people claim'Sthat 
sach differences already are apparent They argue, for examplft, that 
* / a computer, Ban do only what it is told to do and that people can 
■ do m.ofe. A Mechanist 'would counter this contention by saying 
that people can do* only what they are told to do in the same sense 
that they, too, are operating under* a s o etjaf .imposed restrictions. 
That is, m^n's. heredity "tells'* him What to do, including how^ tb' 
learn from his environment. Argumentsjhat "sliow" twat a machine, 
cannot' in princijple ^e-asjntelligent as a marf a|e equkljf/debatable. 
The question is awaiting final proof or refutation. Presence of intel- 
ligent machines will compel man tojace the idea ihat he is not the 
only intelligent creature. The effect on man's image of himself will 
% be evei> greater than-the effect of the realization that man. inhabits" 
• a minor planet revolving around a minor star in a minbr gaUficy. 

Perhaps'man's most cherished claim tcT' his uniqueness ijj that his 
» intelligence cannot be matched b% a mere machine. 

What are the social implications of intelligent Machines? Will 

* eVen our ihost_skille<i workers be displaced by the <new auto- 
' mation? What will be the* impact of intelligent macjiine$> on t,he 

right to privacy? Wi^at doe£4his aiithorb<^leve lies in the dist^t 
1 future?- ■ . x ? ■ ' ,. /•"-. " \ 

The computer will be^ur slave and, in a sense, our. brother. It is 
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up to us to guarantee this ^y taking propcj prccaiitipns. Without 
such precautions there is . Some danger ( that intelligent n?uchincs ! 
-eventually Will "take aycr/' Naturd builds into each humaif a set of 
•primary goal]; or desires, it* the; jinalogy holds, then we arc in a i 
position, to builcf such goafs into a highly intelligent machine, j 
Hence we mqst be # careful that these objectives arc congruent with 
the welfare of ^^umani'ty, Sfmilarif^: we must bf surc-that ho private , 
individual (or grqypj.^can '^"subvert" intelligent Vnachincs toward 
•purposes that arc to thc/dctrliticitf of society as a whole. It will be 
relatively epsy t5kak<j these pfcca&tions if enlightened people learn 
the capabilities arid limitations of highly intelligent machines. A 
computer will be*our btothi?r U^e seu^e that human and machine 
will wprk.togethat to solve problems. ' ^' 

The cqmputctf-m6tiva^^^ is well known and 

Well docnmwtedV^eqtdfiiig no further discussion here. However, it 
is worthjyhjlc'to point out that intelligent machines using powerful 
ficuristics* have intensified th^thi^at: instead of merely extracting, 
summarizing, and 'displaying data about an\ individual, such 
* systems are perfectly capable of/deducing "new' • facts from existing 
ones. This danger, once recognized, can Readily be met by taking 
precautions-similar to those already mentioned. * ' v ; 
,;' As is' usua] with^ technological change; the development of intelli- 
gent machines Wfl lead to, a mixture of benefits and, dislocations. 
S$ : '%, automation has taken away jobs at the unskilled and semi- 
skilled levels; causing particular, problems for certain segments of 
the labor force. With the development of very intelligent machines, 
even highly skilled .workers will^e displaced. There will/6e*a need, 
then, to reexamine the "Protestant ethic" that hard work)is good in 
itself. 'Mapy people will Be able'to transfer their energies to social 
service wdrk; others \will need t<3 adjust their lives to a much wider 
spectrum of leisure. activities. /■ " \ ; . 

Of course, no individual can pcedict the (utu^e. However, I api 
motivated' 1 to make the Mowing predictions: Before thq, end of thfe 
century, computer-based solutions <gf intellectually difficult ^prob- 
lems will pljfty a dominant role in bringing enormous material pros- 
perity \d the world. In less than a'century, computer systems will be 
K making ..substantial progress on the solution of .social problems, 
■ including the overriding probjem of war and peace. Tnen, at last, 
th£ world may be able to live in peace and prosperity. 
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1. INTROD' 




Numerical Analysis ^concerned with determining specific nu- 
merical values for the variety of mathematical entities which arise 
'as solutions of real physical problems. It began with techniques 
directed-toward hand computation 2 and has assumed ever increas- 
ing importance with the \lastta increased computational capability 
offered by modern compufcrsA ' 

Classical Numerical Analysis^ often subdivided into, the pro- 
blem areas associated with interpolation and curve fitting! solution 
of equations, with simultaneous linear equations and matrix opera- 
tions forming a separate category^ numerical differentiation; /nu- 
merical integration or quadrature, and the solution of ordinary/and 
partial differential equations; Overlaying most of these area$ are 
the techniques associated with the evaluation of functions. 

Depending upon onp*s point ^pf view tjie computer has had little 
effect on the field of numerical analysis or jt has had an ovenyhelm- 
ing'impact. One can argue that^the eff^jt has'been minor in that 
many of the numerical methods in use today originated several 
m hundred years ago. These methods still bear the names of tlie great 
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men of mathematics— Newton, liulcr, Lcgcndre, Gauss. The me- 
thods .were devised for pcncil-and-papcr applicajlion; and the actual 
steps involved were addition, subtractior 
sion. These arc precisely the same open 
arithmetic instruction repertoire of toda 



, multiplication, apd divi- 
tions (that foVhi the basic 
's difl 



ital computer. Thus 

.interpolation by Newton's method or Gi^Ussiap integration can be 

manner accurately 

paralleling the hand computation that might have been performed 
a hundred years ago. In this age of computers, the core of numeri- 
cal analysis is still the set ^f methods devised m the great minds of 
an carjicr era. The computer has not etiar/ged the fundamental 
basis of numerical analysis. 

On the other hand, there arc several rcsWcts in which the digital 
computer has had a profound influence on the field of numerical 
analysis. The most conspicuous of these is^hat numerical method 
arc now being used on a scale urtthinkabf^ in the days ofhtfnd 
computation. For example, the numerical methods for sojvifig a set 
of twenty linear equations in twenty unknowns, or fofTmding the 
eigenvalues of a ten by ten matrix, have bcc^kpoWn for centuries, 
One shudders at the thought of actually^etfing out to solve -a 
problem of this size using hand computation, afid it is probable 
that such- massive computations wefc seldom, if ever, attempted. 
With today's computer, howeyci\ problems of thisVijagnitude and 
of much larger magriitude^ire, solyed routinely. \ 

So widespffcad is tj^tfse' of numerical methods in computers that " 
in many .cases thepcrson using the computer callk upon some 
' sophisticat^-riumericai method without even being aware of doing 
so. A ^o^ramming language such as FORTRAN allows the user 
to^slTfpr sin(x) or 'exp(x) by name, and causes the computer to use 
^fChebychev expansion or Pade approximation to compute the 
value when needed. All that is required of the user is alblind faith 
that when sin(x) is requested, the computer will produce \he correct 
result. 

One consequence of the massive application of numerical me- 
thods has been the development of an enormous library of variants 
on the traditional methods of numerical analysis. If one can iden- 
tify any special attributes of the numbers involved in a particular 
problem, he can frequently tailor the numerical solution method to 
' combine some steps to improve efficiency or to improve ^curacy. 
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\ 

Because computer lime is a commodity of value, the discovery of a 
more efficient computational approach can have a' positive payoff. 
Thus a considerable amount'of useful work in the fltfld of numeri- 
cal Analysis has gone into the discovery of special tricks that are 
effective only for limited subclasses of problems. These tricks might 
-well have fallen into the category of tricks not worth knowing in 
the days of hand computation, mainly because the problem solver 
might never find the time to address such problems anyhpw. Now 
that massive numerical problems arc being solved 'routinely, such 
special tricks or limited application methods have a useful role. l r 
the attributc\orthc computer that has had the greatest impact 
on numerical analysis is its speed of computation. The computers 
of today allow more computation to be dbnc in a single day than 
could have been accomplished in the days of Gauss by the entire, 
human race computing day and night for a generation. It is this 
speed, of course, that has allowed the massive application of nu- 
merical methods, with the attendant growth^n variety of method* 
mentioned above. .This speed of computation has^also had some 
other effects on the field 6f numerical analysis. One, which may be 
temporary and which may change as computer hardware changes 
with time, is a. shift in relative emphasis on methods of numerical 
analysis. In the days of hand computation; thc^usc oT tables to 
obtain values for functions ^was an important activity, and conse- 
quently methods of interpolation were heavily used. With cyrrcnt- 
genc^ation 'computers it is generally cheaper and quicker to com- 
pute function values fro'm a scries approximation than to store, a 
table and look it up. Thus the relative emphasis between inter- 
polation and scries approximations has tended to shift in favor of 
series approximations. 

- It is quite possible, however, that future technological advances 
in the area of high-speed, high-density memories would reverse this 
trend. It may once Wore become cheaper to store tabulated values 
and utilize interpolation more extensively. 

There is "a more pervasive, and probably more enduring, by- 
product of the speed of computation, one whichjs of central im- 
portance jn ^application of computers to numerical analysis. 
^I;rom its earlie$tJdays,'the field of numerical analysis has conterned 
^itself with questions of error propagation. So long as computations 
were being done by hand, however, not enough computational 
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steps could bo performed in most practical problem;) to allow error 
propagation to become a serious consideration. With computer 
speed, however, it is not at all difficult or time-consuming to com- 
bine thousands or numbers, or to take an iterative computation « 
through thousands of steps. In such situations the presence of even 
small errors with small growth rates can be quite destructive. For 
this reason the considerations associated with error propagation 
must receive} close attention in computerized numerical analysis. In 
ojder to address this, it is necessary to consider in more detail the 
computer representation of nunlbcrs and the ways in which error . 
accumulation occurs. 

2. COMPUTER RHPRESHNTATION OF NUMBERS 

A number is usually stored in computer memory in the form of 
electrical or magnetic representation of binary bits. The memory is 
subdivided into cells, or words, each cell being able to store all the 
bits t>fa single number. A fixed 'number of bit positions is used for 
each word of storage. Abftut 32 is the usual number, although some 
machines use as few as 12 or as many as 64. B^ausc of the ease of * 
representing birtary numbers by electrical or magnetic state, it is 
""natural and efficient to represent numbers in terms of some radix 
that is a power of 2, such as 8 (octal) or 16 (hexadecimal). Arith- 
metic circuitry for any particular computer is constructed to match 
the, number representation used in that computer. Ordinarily the 
arithmetic circuitry will allow- for two different types of number 
representation, integer and floating point. In representing integers 
as 32 bit binary numbers, it is only-possible to represent numbers 
having absolute value from 0 to 2'to the 31st power, or about 2 
billion. Therefore, floating-point representation is usually used for 
general .computation. In floating-pojnt representation, the number 
is represented by a mantissa, f, and an exponent e. The number is 
interpreted by the arithmctic circuitry of the machine as 

fx f'- : '. 

'where y the radix used in the representation. In order to stay on 
familiar ground in discussing accuracy problems, let us assume a 
radix of 10, although,>as already mentioned, a radix of 8 or 16 is 
'more common in practice. For a radix of 10, the standard floating- 

' 300 ! ' • 
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point number is pf the form \i ; 

TTie number of cligits used to represent f is fixed by the word length 
-of the computer. Let t stand, for the number* of digit positions - 
aliowed. Then every floating-point number stored in the machine 
will contain t digits, the number t. is termed the '■precision" of the . 
; floating point number, ' ~ " ' 

Precision is closely plated to the relative error achievable in 
representing a number within th£ computer. Customarily the com- 
puter will normalise floatingpoint ^numbers, so that there are no 
leading zeros. This is done by shifting the mantissa and adjusting 
the exponent accordingly. Thus, on 1 a machine having 6-digiU pre- 
cision, a number whose computed yalue was . • 

.003^4724 x 10 4 
would be represented as* , f 

■315472- 2 . . 

rather than 

/ 003155 ' 4 ' - . , 

tf the machine rounds wjiile storing, pr 

: ; 0031/4 4 

if the machine does not round. \ 

For a normalized floating-point number, on a machine with pre- ; 
cision t, the best that can be assumed generally about relative error 
is that it is 5 or less in the t + 1st digit position. (We will talk in 
terms of a machine that rounds rather than truncates to get rid of 



* In this representation, the 315472 is the mantissa, assumed to have a decimal 
point in frotft of the 3, and the 2 is the exponent, the power of 10 by which the 
mantissa is to\be multiplied. A computer will ordinarily carry these internally as a 
single numberAwith an^ffset added to the exponent so it will not be negative. If an 
offset of 50 were -used, allowing a number range from 10" 50 to IO 50 , the number 
would appear internally as 52315472, the first two digits representing the offset 
exponent, and the remainder the mantissa. / . /' ■ 
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extra; digits.) This implies/a relative prror on the order of \ 

,-: ; v V .5/(.5xlO ! ), > -ltH 

■■■■■'.■■■r : » : . ; • . : ■ - * 4 ■ ' ■ , \ '■ . 
This assumes, of course,- -thai all (JigitS in the storied nuipber^are 
significant. If the number represents some measured or estimated 
physical error, or if it is the result of a series of computations using 
stored numbers, it irfay have less accuracy than the precision of the 
machine would indicate/ 

vFor example, if the relative error were actually 1 percent, only 
> two /of the stored, digits would be significant, and the remainder 
would be garbage. Unfortunately, the computer itself provides no 
warning or indication^ this fact; so it is yp to the user to protect 
himself from accept ii^inaccurate data; , v ' r 

3. ERRt)R ACCUMULATION 

/in the standard arithmetic Operations of addition, subtraction, 
multiplication, and division, people generally assume that they 
start with two known operands and that the operation produces a; 
rlumber whfch' is the desired result. In fact, they usually start with 
tW operands which represent only approximately the true, but 
unknown, numbers that are of interest to them. Thus they deal 
with "approximate" numbers as- representation of "true" numbers, 
and the goodness of the approximation is changed by -arithmetic 
operations. .> V . . 

; To illustrate how this happens, consider the results of performing 
additions on two such approximate numbers u r dnd u 2 . Assume 
frtthet errors in 'the two numbers are bounded in absolute vatue by 
Auv and Au 2 , respectively. It is easy to see that |he error in the sum 
due to the error in the operands is no larger, than. (Auj -f Au 2 ). 
Rpund-oflf introduces an additional contribution as described pre- 
viously; so that the total ferior from the, operation of addition (or 
subtraction) can be as large as ^ . 

: V Aui + Au 2 + (1 x 10)~ f | u t + u a I ■ 

■ \- A more meaningful description of the impact of the error is given 
by the "relative error, 55 which gives the size of the error as a fraction 
of the magnitude of the error-free result. This is particularly re- 
vealing in the case where we subtract the two positive numbers u x 
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and u 2 where the relative^rror is given by * . ' 

: . > : ^v±^iX(iJ6 x, 10-) - JSiii' , • ' ■ 

l'Uj,-U 2 | s ; • |u i; -U 2 | i . 

.. ' y' j-^— n +.— ^- J+ i x io- V " - 

, l(Ui + U 2 ) (Uj + u 2 ) 2 J 

with r x = Auj/U! and r 2 = Au 2 /u 2 being the r&ative errors of the 
4 :two arguments.JThe expression in the brackets is no larger than the 
maxiipun^of the incoming relative errors v x and r 2 . However; the" 
expression in front of the brackets is greater than one and can 
become very large if u t and u 2 are nearly equal, greatly magnifying 
the original errors. . 

It is an unfortunate f^ct of life that subtraction of two similar 
numbers in the fixefl word length CQmputer is the^ greatest single 
cause of loss 1 of significance" in typical Calculations' Multiplication 
and division are much more benign, producing ; -relative errors that 
^are bounded by the sum of the relative errors in the operands; The 
existence of floating-point arithmetic does nothing to alleviate the 
problem. It is incumbent on the modern numerical analysj to be 
aware of the problem and to attempt to minimise it. 

Fortunately, there are often steps which can be taken in the 
formulation and implementation of algorithms which can signifi- 
cantly reduce e^ror growth. In many cases a carefuj rearrangement 
of the order in which individual operations are performed will 
reduce the potential error. For example, if a and b are approxi- 
mately the same, it is better to compute a 2 — b 2 as (a + b) • (a — b) 
as opposed to, (a 2 ) — (b 2 ). Tracing the error caused by round-off 
through more complex operations involving many operands re- 
veals that it is generally better to begin evaluations with the smal- 
lest term and evaluate the arithmetic operations with the largest 
operands last. For example, tracing the evaluation of 

■-'•>' * < (a +(b +(c + d)» ; ,j 

with no errors assumed initially in a, b, c, or d gives a relative error 
due to round-off alone of 

(a + 2b H- 3c H- 3d) "■" ■ tA . u 
(a + b + c + d) \ 
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Therefore; fHe error is smaller if c and »d aire the smallest numbers 
of the original set • . *' ^ ' / * / 

larmostfiases^it is difficult to trace the error gro\tfth through a 
complex c6'mj)uter algorithm. In these sujcumstancesr'the computer 
itself can>be'used to Jielp estimate the ^rror growth by carrying put / 
the .'estimates for each individual arithmetic oSperattion, just as was 
done above^ This can be done with competing algorithms, for^ex- 
ample, in order td select the one having 4, the best error character- 
istiQS. ; , ••' • ' . ■ 

The^error estimates that have been described so far are usually 
over-pessimistic. This is because they do not account for the fact - 
that errors are often of opposite sign. This is particularly,ttue of the 
errors introduced by round-off ^hich can be expected to be 
random in direction and magnitude. If this is taken into account, it 
can be shown that the cumu lative error 'due to round-off in N 
operations is less than y/l2N x 10"" 1 with very high probability as 
contrasted with the estimate of N x 10 " 1 t which would result from 
the- techniques described earlier. A similar approach can be taken 
with respect to the propagation of errors ^through a sequence of 
/arithmetic operations, but the assumption of randqmness becomes , 
^ much harder to justify and it is safer to rely on the more cpnserva* 
tive approach in determipinig error bounds. 

The errors described so far are fundamental tpjhe basic arith- 
metic operations of the computer. In addition to these, there are 
the errors introduced by the technique chosen to evaluate a parti- 
cular mathematical entity on the computer. These are , the errors 
that would bq introduced even if there were no error in the oper- 
ands and if the computer had infinite precision. As such, they are 
the type of errors which have been the object of study under classi- 
cal numerical analysis. These, top, are introduced and propagated 
through the ongoing stream of calculations typical of the large- 
scale problems solved on modern computers. Since such errors are 
peculiar to t he /unction being performed, we will point t hem out in 
the context of summarizing the major areas of Numerical Analysis 
after a brief discussion of function evaluation techniques. 

As we briefly describe each of the classical analysis areas, we 
will limit ourselves to examples in one dimension. Extension/ to 
multiple dimension^ are often straightforward, although with an 
attendant penalty in terms of computer resources needed for im- 
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: ' ; . - 1 ■ " v * • 4-'. ' ' ;■ ■•' 

; plementaticpi. Afyitlf few exceptions, there, are ft variety«of algo- 
rithms availabld to' solve any numerical analysis problem on. th^ 
modern ccfiftjputer. For normal applications; it may be difficult to 
/ justify selection of one over another. The best advice to prospective 
probldm-solver^'lHo^dentify'aild at least tvf techniques that are 

* already available on their computers before resorting to the devel- 
opriJent of new onesi ; . V jj? - , . " 

' • " ,' . •■ ■ •• ' ' ' : *J\ >'■'•' 

4. EVALUATION OF FUNCTIONS 

Although not considered a major research area of classical Nu- 
merical Analysis, the evaluation of functions is inherent in every 
application of the computer. These Vange from the comrfion 
trigonometric and exponential functions, to the less frequently used 
Bessel and Elliptic functions. Often there are several evaluation 
techniques to be considered for implementation on the computer. 
For many functions, tables of values are available which could be 
read into the computer memory and used just as an individual 
would look up and interpolate to derive the value at the point of 
interest. However, entering the tables is often a nontrivial task and 
% the speed o£ modern computers 'generally ifiakes direqt evaluation 

* of the function a more efficient approach. In most cases the func- r 
tion is to -be evaluated for values of the argument in some interval. 
Direct evaluation -of functions in the computer is usually based on 
approximation by a. polynomial or a* rational function. The proto- 
type of the polynomial expression is given by the standard Taylor 

* series expression for a function 

- V y . .", . 

/ f(x) = f(x 0 ) + f'(xo)(x - x 0 ) + f (Xi)(x 2 ~ Xo) - + - • 

where x 0 is 'some point in the inter val. The error intr oduced by this 
approximation depends on the number of terms retained in the 
v series and can be estimated by the remainder term derived in any 
* Advanced Calculus text. In the Taylor series this "truncation error- 
is small near the point x 0 and takes on its extreme values at the end 
points of the interval oyer which the function is to be evaluated. 
A better approximation, from the viewpoint of utility on the 



ERLC 



• ' V- impa'ct of compute^ ^bN'NifMERiCAL Analysis 289 
.; 1 ■■ ' v .;; .. j-.. , . • ; ' . 

computer, is given ^ expanding the'function f(x) as • ( 

where T B (x)',is a Tchebyshev polynomfctl of order n. Such a(n, ex- 
pansion tend v to give a uniform error bound over the entire inter- ■ 
val and generally allows' an expansion to a lower order than the ' 
■Taylor series for equivalent accuracy. . 
" , For functions that are to be evaluated a large number of times, a 
greater approximation accuracy (for a given^ order in the poly- 
nomial) can be achieved through; tHe use of rational functions of 
the form 

R„ k (x)- Ijk=ibjxJ . , ^ 

-There exist sophisticated 1 procedures which evaluate the coef- 
ficients, ai 9 bj so as to minimize thQ maximum error in the approxi- 
mation* over the interval of interest. To save computational time, 
such, techniques are frequently used for the standard library func- 
tions provided with the computer such as the sine, cosine, etc. The 
coefficient evaluation is too complex, however, to be justified in 
mdst computer function evaluations. i 
The errors introduced by the truncation of the polynomials used 

.either directly or in the rational approximations are the first source 
of error to be considered in implementing a particular function 
evaluation. This error can generally be made as small as one likes 
depending on. the nuiflber of terms which are. retained. With single- 
word-length computer calculations there is.obyfbusly no point in 
carrying a greater relative accuracy than 10 ~\ V 
Overlaying the error inherent in the approximating function 
itself is the accumulated effect of the errors propagated through the 
arithmetic operations required. These can Be estimated through the 
techniques introduced earlier. Polynomial evaluation^ in particular, 
insusceptible -to the growth in relative fcrror caused by subtraction' 

of similar numbers. * 

Even when care has been taken to minimize^the magnification of 
errors in each required arithmetic operation, the function being 
evaluate^ may have characteristics which tend to magnify existing 
error in the argument. This can be estimated using the first terms of 
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the Taylir series. If vfce pre evaluating, f at th$ approximate argu^ 
/ment«u + Au, then v>e can say; , . " ■ « ' y. 

v % :. f(u + Au)~f(u) + f'(u)Au + 0(Au 2 ? * : 

vFor arguments where | f '(u) | is much larger than one, the error 
^o^tained in the*rgument canbefcbrrespondingly magnified; 

5. NUMERICAL QUADRATURE / V 

-•/•• / ' • :i ■ 

Quadrature, the evaluation of definite integrals, is ;one of the 
areas of classical numerical analysis where the computer has had 
major effect. Prior to the advent of computer, technology, the accq-|! 
rate evaluation of J " -Jf 

\ ; l=Jf(x)dx 

posed : a formidable pfobiem, impractical in all but a number of" 
simple cases or where tHe indefinite integral would be evalu^i<4| 
explicitly. Today such evaluations -are performed routinely/ al% 
though there are still' problems involving quadrature in multiple- 
dimensions'which require unavailable amounts of time even witti 
the fastest computers. • . ; "' J, 

The trapezoidai rule provides one of the simplest quadrature 
. techniques on the comjputer arid is almost a direct extension of the 
definition of the integral. The interval from a to b is subdivided by 
a number of points Xj separated by equal steps of length Ax and the 
integral I is evaluated as V- 

I = (AX/2)[f(a) 1+ 2f(x l ) + 2f(x2) 4- • + 2f(x n+1 )'-f f(b)]. 

" It can be shown that the error from this approximation alone is 
proportional to (Ax) ? . This algorithm has the benefit of simplicity 
but requires the evaluation of the integrand f(x) a total of (n + 1) 
times. If f(x) is expensive (in terms of computer time) te evaluate, it 
follows that the same will be true for the trapezoidal rule.,- - -~ 
In brder to reduce the evaluations required for fl(x) without re-r 
during .the accuracy of the evaluation, a large number of other 
■ quadrature formulas have beeh derived. iThe trapezoidal, rule is 
based on approximating the integrand f(x) bjetween x, arid x,+ { with - 
a straight line; If a higher-order polynomial approximation is used, 
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then the power pf (Ax) app^pg in the error term also increases./ 
Th&mosrt' common fo^jul^fthis type, known a$ Simpson's Rule, 
u^es a quadratic estimate^r f(x) and has an error proportional to 

The tlji^ 'majo^^pToach , usually /goes ;under- v tjie namQ of 
Gtfiissian ^^dra^if^ro'mu^ In this case the equal intejvals Ax 
are s r$>Jaced ^by^||^iiequaily spaced sequence' of points so as. to ( 
/ miaiimize the err^The determination ofthe points and th& associ- 
'/ate^b^fp6k^is-Jb'be used "in the approximation for I is difficult so 
tl&iiM&ffi tabular values are normally used, 

rjfcrrdrs m^^^rkfUre are likely to ;arise from a number of sources 
^^her thSi^^^emainder associated witb the integration interval x 
0r equiy^^iy, with the number of points in the interval at which 
f ^ttx)/^^ated). Obviously, there fhay be error terms associated: 
^j^K^0vkluation of/fl[x) itself as discussed previously. The error 
* -^^^all-bf the quadrature techniques involve the value of a 
^j^lferivative off as well as a power of x. For ipany functions, 
Jribution can significantly increase the size of the error term \ 
evaluation of f unless the integration interval is reduced 



~6i*SOL^TION-GF~EQUATIONS^ 



We have already discussed the icommon problem of evaluating y 
when- r ^ ,j * - : \ [/: 

Frequently one is faced with the converse problem of evaluating x 
given y. When the equation can be solved explicitly for x, this offers 
no difficulty; however, ohe is generally not so fortunate. 

In summarizing the numerical techniques for finding x, we may 
assume jftiat y= 0 since the problem is easily reformulated so that 
'this is the case. Thus our problem is one of finding the roots of uie 
function fl(x). 

Almost without exception, the first steps consist of localizing the 
roots so that iterative procedures,- which work only in the neigh- 
bprhood of the solution, can then be used to find the precise value. 
If nothing is known about the function f^:) initially, the first step is 
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*often the generation of a 1 graph jor table 'b ues Which will give a 
gross estimate of where ]he roots might beV j 
for general equations, one of ;the 'most straightforward tech- 
Vniques for finding a root is'the bi/eptipn method* In this: case we 
must start with an interval, [a^b] where, for example, f(a) is negsa* 
tive and<f(b) is positive, and Pfxi is krfown to be continuous between 
a and b. The function fi.then jSyaluated °£t x (a + b)/2 and the 
original endpoint (a or b) for vvhich^f has the same sign as 'f(x) is 
discarded. The joot is now known to be in the interval wtych is 
half the size of the original. 'This procedure can be repeated to 
specify the root t^VanyvdesireB accuracy vand therefore actually : 
solve the equation 1 However* if f(x) is difficult to evaluate (or if the, 
variable x is a multidimensional vector), it often becomes more 
efficient to apply other; techniques, once'the interval -containing the 
root is sufficiently small/ • ■. •"''/, ^ ' - 

The most cortimonly applied procedure is kpown as the Newtpn-i 
Rapjison algorithm and is based on approximating f(x) with ^ first- 
order Taylor series, expansion, about the . most recent estimate for 
the root. The solutioti for the root of the resulting equation is. taken 
as the revised estimate and the procedure is repeated Until the 
values converge. The speed of cortyiergence depends on the- ratio / , 

*\ - ' 4 ■ Iflx)f^'(x) |/(f'(x)) 2 : , ' ; £, ~p]//->?'% 
near the ropt r £nd~t^ 

td occur at ajj. ' ; v ' /' •/'?Vv--- .• • • : < . 

For some equatiohs, f(x) can be rewritten as •', \% 

Then under certain circumstances, the iterative 'formula x n == 
g(x n »!) will converge to the root if x nr . 1 is sufficiently close' to begin 
with. V : - '•■v'V' g - -.'3 ^ " .,' ,';/. / 

If fifx) is. ^reai-polynbiSiial of ;order greater .than fou^ tjtiete are 
no explicit formulas for the roots in general. However, thefe is a 
wide body ,pt classical algebraic , 4heory4hatc^n^e' ' applied r .to lielfc 
localize dv define /the' rpotsT These range from well;kho\vh tecfc » 
nique$ of factorization to reduce the order of the polynomial once a 
singly rpdt^isj found tprprocedures for identifying the niimber of 
such toots in jan interval based, on the changes: in size of the; coef- 
ficients. EVen in thesa cases, however^ there is no single xnethocl 
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which is completely satisfactory 'for findin&all bftthe roots of a 
■ pqlynpirtiai. ; i <u ^l>. r ' e:' • r,;.';.v / • " " : .-<-. ; - ' ■ 

The OTprs ; which arise in the solutioif of equations^by the . tgtgj- 1 ; 
• niques Indicated are relatively easity controlled. If the iterative 
t^chnitj[U^con\<lrge at all, » the difference, between successive values vv 
will Jbound jlh^error from this source^ A more serious pro$enj is 
likely to: r anMfrom the numerical error jn#aluating the jutiCtion. 
NThis mdy r^lult in shifting or even elimihat|ig some roots.** 

7. NtATkl^ EQUATIONS .''•> *• . ( * > 

: v While trie techniques of the previous paragraphs are tidily ex- ? 
tended to* finding sojutiohs or^rpots of simultaneous Equations, of 

. several vaxi^ble^, systems of linear .equations are\giyen their own 
special category among the classical problems of Numerical Analy- 
sis. This is because of the* frequency with which, such equations 

v arise in sQlyihg" real-world, problems and because of the vast'body v 

\of xelevaht mathematical theory. Computers have made ,a .major 
contribution in Extending the scope'' of numerical problems iiri this 
area that, can be practically solvedtoday. /'•:.••., 

/V';tii:tiiidihi|- the 'Sjdlutipir^f a set of equations of the form s 

a •. '* *' . ? * ' .; • Ax = b • .. . . ■ ' <-\ " -.v 

,.or;-\ , . • ■^•■■ > ' >, 

" ^11^1 + a 12 x 2 + a 13 X3 + • • '+ a ln x n = b i ■:; - 

. : ^ : iMx^-f: 4l 22 x 2 + x 3 + ; • • + a 2l? x n = b 2 ' 

: ;a nl x 1 -+ a n2 x 2 + a n3 x 3 + : • + a no x 0 = b n ' r 
• ' ■ * ■ ■ ■•■ ■* <.-..■*..'■■•' \. • * 

the most common techniques are based on Gayssian elimination. 

Successive multiplication of the equations by a constant and sub- 
traction of one from anothef leayes^eithler ^ diagonal -or upper 
triangular form for the equations .which are then easily solved.- 
-Unfortunately, because of the number of subtractions, involved, 
these techniques frequently introduce a greatdpal of errors 

Another approach (Gauss-Seidel) is reminiscent of the successive" 
substitutions used to find the roots of x •-. <f> (xj. The linear equa- • 
tions are rewritten so that the Jcth equation has a klc x k isolated pn 
the left side. Starting with an initial estimate, the x, are' introduced . 
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on the right side of trie- resulting equations to produce a new &ti- 
mate. When the matrix A satisfies a nominal -set <5f conditions, this, 
sequence of operations can be shown to.converge to the solution. 

A related problem wfyich arises frequently in physical appli- 
cations involves solving the equations /Y : , 

both for x and the so-called scalar eigenvalues A. An entire reper- 
toire of specialized techniques has been developed for solving this 
problem but is beyond the scope of what can b.e discussed here. ; 

In general, matrix operation^ are characterized by jlarge numbers 
of additions, often of terms of a^roiximately/tlie same magnitude. 
This makes them particularly susceptible to loss Of significance and. 
resulting error. If the matrices involved are large, it is always wise 
for the analyst to coi^itirm the results of solving any matrix equa- 
tion by evaluating the original equation with the solution found. In 
some pathological cases, even'this is not enough to expose the 
existence, of IaVge errors. " s •- •. - . ' x : V. 

8. INTERPOLATION, CURVE FITTING, AND , , , 

NUMERICAL DiFFERENTlATION V \ . ■ . . 

\ - . j '■■!'.»' ■■ '* ' ■ v ' i '. < i 

During thejdays of hand calculations'; ;tHe use- of .tables was the 
predominant 'form of evaluation of functions. As a result^ inter-, 
polaiion techniques refceived intensive 1 study in classical Numerical 
^Analysis. As indicated earlier, hjgh-sj^ed cpxnputershaveitended to* 
reduce this application, but it is still used often enough, to be sig- 
nificant. In ahy case*, changing, computer technology offering low- 
cost and higfi-jlensity memories may reverse the trend in the future. 

Classical interpolation techniques are based 0h polynomial ap- 
proximations to the, given tabular function? The co^ffidents^are. 
determined by the requirements that 'the polynomial pass through 
the tabular, values surrounding the; point when the new value is 
desired; 1 The classical formula, Newton, Stirling, etay differ 
tially, in the. grouping, of the terms ijn^ the resulting pt>l^noiniaI or in 
the 1 requirements dri "tabular spacing. Errors inherent in the poly- 
nomial techniques depend' primarily^ on the distance between tabu* 
laf values "and oh the magnitude' of the hjgher-ordier derivatives &f. 
the Yunctipn being evaluated. Onty if it is a polyhomiafbf th^same 
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or lesser degree as that used for the interpolation will the itsults be 

exact. 9 / 

As in the other aspects of Numerical Analysis, the arithmetic 
operations in the interpolation formula may also introduce signifk 
ca'nt errors. 

The computer has made possible significant advances in a more 
complex but desirable type of interpolation procedure knpwn as 
spline fitting. Although polynomials are used to estimate the tabu- 
lar function in this case also, a smooth fit is ensured by forcing the 
derivatives of the polynomials to be continuous from one interval 
to the next. The resulting conditions on the coefficients of the 
estimating polynomials are matrix equations and are too complex 
for solution except on modern computers, t 

The polynomial formula used for interpolation is also generally 
used when numerical differentiation of the underlying tabular func- 
tion is desired. Each derivative, which lowers the degree of the 
interpolating polynomial by one, is a less accurate approximation 
than the previous. As a result, numerical differentiation should be 
used cautiously and only where absolutely necessary. 

Often the tabular values represent physical measurements which 
contain varying amounts of error. In this case it is more reasonable ' 
to evaluate the associated function using a fitted functional form, 
either known or assumed. The functional forms have coefficients or 
other parameters to be determined according to some fitting cri- 
terion, often to minimize the sum of the squares of the differences 
between tabular and corresponding, fitted values. In most cases, 
these problems reduce to matrix manipulation problems of the type 
described previously with all of the associated problems., As a 
result, it is only with high-speed computers that any but the sim-. 
plest problems in curve fitting can be attempted. 

9. SOLUTION OF ORDINARY AND PARTIAL DIFFERENTIAL EQUATIONS 

• ■ * 

The behavior of most physical systems Is described: in_temspfa_ 
differential equation, ordinary or partial. With the use of the cdm- 
puterjt has become possible to extend the evaluation of such equa-* 
tions'beyond the small subset which can be solved explicitly. This 
potential has made this aspect of numerical analysis probably the 
rribst intensively studied over the past few;, decades. Even so, there 

, ' -312- .. 
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rerriain innumerable- differential equations arising in* the physical 
sciences which cannot currently be evaluated. * ' 

Q ■■ Most of the problems associated with the numerical evaluation 
1 : < of. solutions of differential equations can be discussed'in terms of 

f thesingle ordinary differential equation . . . 

.dy/dx = f](x, y); y(x 0 ) = y 0 .' • / 7 

Under rather mild conditions on f, each initial condition y 0 deter- 
mines a Unique solution pr trajectory y(x]^ However, in somexiases 
such trajectories tend to diverge from each other very rapidly even 
if the initial conditions are close together. Under these circum- 
stances, the small errors that are unavoidably introduced, for 
'example in evaluating f, can rapidly grow, to the point of invalidat- 
ing the result. Such differential equations are said to be unstable 
and cause majoif difficulty in evaluation. 

Almost all numerical ihtegration techniques proceed one step at 
a time in the dependent variable, of length h. The simplest 
■';.;/' approach, known a? Euler's formula, is given by ' 

y(x + h) = y(x) + hfl(x,y(x)). ft. ..- '\ 

This is really equivalent to a first-order Taylor series expression 
and the erroivat each step is proportional to h 2 and the first deriva- 
tive of f at some point in the interval from x to x + h." 
• - In practice, the Euler formula is not sufficiently^ accurate for 
reasonable ^alues of h. By considering higher-order* terms in the. 
series expansion and combining the results, one can derive the. 
so-called Runge-Kutta formulas, which give greater accuracy at the 
expense of additional function evaluations in the interval. However, 
even the higher-order formulas reflect the inherent instability in the 
differential equation if it exists. ^ 

A second major category of numerical integration formulas are^ 
the so-called predictor-corrector methods. These utilize a poly- 
nomial extrapolation for y(x) based on a number of previously 
_ _ jgyaluate^ points; The. predictor-corrector algorithms -tend fb_ be _ 
somewhat more stable than the Runge-Kutta methods, although 
*not necessarily more accurate. . » 

The resources of the computer make it practical to determine the 
step size h adaptively tt as the integration progresses so as to use 
"* almost the largest value compatible with maintaining a given accu- 
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racy. This is don? by comparing the error terms from two integra- 
tion schemes of the same order (e.g., a'fourth^order Run^e-Kutta 
and Simpson's' quadrature rule). Under^the assumption that the 
error is no greater than the sum of both remainder terms, the value 
for the size of the next step cart be derived. .Such adaptive tech- 
niques are 'the basi? for most numerical integrations performed on 
computers today." *" ; 

The solution of partial differential equations (PDE), while draw-. 
Jng on the results from ordinary differential equations, is ^n im- 
mense area of study beyond description here. Specialized ap- 
proaches are required for th* major categories of partial differential 
.equations and entire texts are devoted to the study of each cate- 
gory. The numerical solution of the PDE shoirtd not be attempted 
without a basic understanding of the body of mathematical theory 
developed in this area. , 

10. # NEW PROBLEMS IN NUMERICAL ANALYSIS 

While the classical subdivisions of Numerical Analysi&^are still 
very active with many unsolved problems remaining fot&the 
researcher, the use of .the computer has expanded ihterestgn'new 
areas as well. In most cases, these have arisen through attT^ ~ A ~ 
optimize the behavior of real systems through computers* 
~Fpr example, one of the early problems . dealt with|^ 
mization of a linear and nonlinear function of many va< 
Wjjiich were themselves subject to constraints of the/orm 

ak^ + ak 2 x 2 + ••• + ak n x n £ C k . 

The use of additional variables z r , called slack variables, can be 
used to express the inequality as . a 

ak l x l + ak 2 x 2 + + ak n x n + zj = C k 

and reduce the problem to the solutions of simultaneous equations. 
Similar approaches work for nonlinear optimization with in- 
equality constraints. 

Attempts to optimize complex functions of variables taking on 
discrete values and complex combinatorial problems similar to the 
"traveling salesman" problem (which attempts to minimize the 
traveling distance required to visit all points in a complex'network) 
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are still unsolved in general. Current research addressing the 
- measurement of problem complexity suggests that success in these 
areas will always be limited. *:...: 

The use of the computer in the real-time control of complex 
systems is forcing the reexamination of many 'techniques from 
Numerical Analysis with a view toward making their* operation 
faster. Outgrowths of such studies have resulted in the so-called 
Fast Fourier Transform and the Kalman Filter, the latjer essen- 
tially a recursive scheme for doing least squares estimation,/ 

Tt seems clear that man's pursuit of insight into increasingly 
complex physical, processes, from atomic interactions tb weather 
'prediction, wifj require both faster and larger computing capacity 
and innovative algorithms in numerical analysis^ : 
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COMPUTER SIMULATION » 
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1. simulation: models and methodology 

LI Introduction. Digital computer based simulation is rapidly 
becomirig the predominant technique used in the analysis of qom- 
plex systems. The systems and problems tackled by computer simu- 
lation span the range from traditional engineering based systems 
[1], [2] to' biological, environmental, urban, and social systems [3], 
[4]. There are a number of reasons for the wide and increasing use 
of computer based simulation: '•, \v / 

First of all, models df the complex systems with , which we are 
currently concerned are rarely amenable to analytic solution by 
traditional mathemiatical methods. Computer based approximation 
and numerical analysis techniques which lie at the heart of digital 
simulation, however, can usually be used to "solve" such models, 
Second, computer, simulatibn languages and facilities have devel- 
oped to ^poSTt where it has become intellectually easier to formu- 
late, program, and obtap solutions, to these models. Third, the 
decreasing cost ot computers, due to technological advances, and of 
computing,* due to higher-level languages, have made the simu- 
lation approach more economically reasonable. V 
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Finally, over the past few years our society has become increas- 
ingly aware of the difficult problems it faces in almost every sphere ( 
of activity. Such problems as energy, environmental quality, and 
industrial productivity are complex and interrelated. Engineers, } 
natural scientists, social scientists, and social planners are now 
coming together in interdisciplinary groups to try to solve these 
problems. The training of the individuals in such groups often has a 
quantitative component and their commitment is usually to a 
quantitative analysis of the problem at hand. This, added to the 
complexity of the systems they're attempting to deal with, leads 
directly to the use of simulation methods. The common language of 
such groups is becoming the language of simulation with the 
system model being the core to which individuals contribute, on 
which, individuals test hypotheses and argue, and from which policy 
recommendations are |made. This trend will undoubtedly continue 
since it offers such problem-solving groups a common communi- 
cations format, a common problem-solving discipline, and a model 
solu^ioi\ procedure. 

The discussion so far has used the terms system, model and simu- 
lation in a general, intuitive manner, avoiding the problems of at- 
tempting precise definitions. The terms" are used in such diverse 
ways that deciding on reasonable, useful, and yet concise defini- 
tions is difficult. Nevertheless, let us proceed. 

A SYSTEM may be defined as a set of interdependent elements 
acting to achieve some implicitly or explicitly defined gdal. Thus] 
for instance, a computer can be considered to be a system whose 
interacting elements include logical gates and memory units. In this 
case the goal is .explicitly defined by the known properties of thefc 
elements and the manner in which they have been organized. On I 
the other hand, in the case of biological processes (e.g., evolution) 
the goals are often implicitly rather than explicitly defined. . 

Once a collection of elements is recognized as constituting a 
system, then the proces§; of describing the system begins. The de- 
scription itself is referred to ais a MODEL of the system. There are 
clearly vastly different ways of describing or modeling systems and 
a 'number of approaches have been taken [5] to classifying these 
different descriptive modes. Development of such a classification, 
or taxonomy of models, clarifies how the digital computer can aid 
in model formulation and solution and also clarifies what types of 
models are appropriate for investigating different systems. 

317 
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1.2, Model Classification. The classification scheme presented 
here follows that of Mihr'am [6] and is summarised in Table 1. 
One of the main distinctions to be made is between material models 
and symbolic models. A material model represents a spatial trans-- 
formation from the original physical system of interest (the "real" 
system) to some other physical system which in some sense is sim- 
pler, more easily understood, or more easily manipulated. One typQ 
of material model' is a direct replication of the syste^i in question 
but on an altered dimensional scale (e.g., a model airplane). An- 
other type of material model is a quasi-replica of the real system. 
Like the replication, it is a physicalUnodel in which a spatial trans- 

i formation of the real system has occurred; however, in this case 
one or more dimensions have been omitted (e.g., % road map). The 
final type of material model in Mihram's classification scheme is 
referred to as an analogue model. In this type of model no attempt 
is made at preserving the physical dimensions of the real system. 
The, primary objective is to preserve behavioral or performance 
characteristics. Analog computers haveybeen used extensively in 
this way to model systems governed by sets of differential equa- 
tions. The analog computer consists of electronic components 
whose behavior (i.e., variation in voltagejs ^nd currents with time) 
corresponds to simple mathematical ojjerations such as addition, 
multiplication, and integration. These components may be inter- 
coniiected so that overall behavior of the system represents, for 
example, the solution to a set of differential equations. The com- 
puter, interconnected in a particular manner, represents an ana- 
logue model if this set of equations is the same as the set governing 
the physical system of interest. 

While material models attempt to maintain a physical link be- 
tween the model and the real system, with symbolic models this 
link is broken. Descriptive models are one type of symbolic model 
in which natural language, (e.g., English) is used to represent a 
system. The symbols in thiJcase are elements of the language, and 
manipulation of symbols Mlpws the allowed grammatical rules of 

„' the language. A botanist'^ written description, of a plant is a, de- 
scriptive model. Formal models, or formalizations, are another type 
of symbolic model. This type, however, is one in which symbol 
operations-fall within a highly developed mathematical discipline, 
such as integral calculus or numerical analysis. A differential equa- 
tion model of a system is representative of this category. 
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Falling between descriptive and formalization models, and often 
, containing elements of both, are simular models. Higher-level pro- 
gramming languages, for instance, allow users to represent , systems 
very much as descriptive models; however, internal to the language 
processor where formal numerical methods are utilized, the models 
are more of the formalization type. This is one of the reasons that 
simular models and simulation languages have achieved such wide- - 
spread use and popularity. That is, on the one hand, the language 
features available allow individuals to describe systems in a com- 
fortable manner often closely related to natural language descrip- 
tions, while, on the^ other hand, the built-in formal numerical analy- 
sis constructs assure, under proper conditions, model solution. In 
addition, the users of such simulation languages often need not be 
too deeply concerned with understanding the details of thq numeri- 
cal methods utilized/ since these are built into the language pro- 
cessor. Thts allows users to concentrate on the structure and 
properties of the system of interest. Finally, due to the symbol- 
processing capabilities of digital computers, such computers have 
become the primary tool used for the development and solution of 
simular models. 

The models thus far have been classified in terms of the level of 
abstraction used to describe the system. Material models of XhS 
replication type are the least abstract, while symbolic models of the 
fonqalization type are the most abstract. Another dimension of- 
model classification deals with whether the model is Static or 
Dynamic* Deterministic or Stochastic. A static model is one whose 
behavior does not change with time. Thul$or instance, Ohm's Law 
is a simple static model of electrical behavior of a circuit in equilib- 
rium. If the circuit is disturbed by, say, introducing a time-varying 
voltage, then a set of differential equations incorporating Qhm's\ 
Law would be a more accurate model,' and this would be a dyna- 
mic model. If the models described above contain no nonrandom 
elements, then they would be deterministic models. If, however,^ 
there were random elements, then they would be stochastic models. 
Thus, for instance, if the voltage on the circuit Varied in a random 
fashion, the appropriate model of the system would be a set of, 
stochastic differential equations. Such a self of equations would ' 
constitute a formalization model of the stochastic, dynamic vari- 
ety. , , ,;' '. ■ '" ^ ■ ' ; 
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This completes our review of model taxonomy. The remaining 
> discussion centers around simular {nodels*. since these are most 
closely connected with digital computer simulation. 

1.3. Simulation Methodology. Consider now the term "simu-; 
lation" itself. Although.it is often used interchangeably with "mad-; , 
eling" "simulation" implies a good deal more; Simulation is pro-; 
cess containing a number 6f components, one of which is modeling r 
or describing the system of interest. The process can be discussed in. 
terms of a sequence of stages as illustrated ii> Figure 1. : /\- >«<■ ■ 

The first stage is problem formulation and study planning. Deter- , 
minatibn of simulation study goals, and plan of attack, is often the' 
most important and difficult p^rt of a simulation; The difficulty * 
arises because there is a; £lose\connection jtvetwepn what are es-; 
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. Fig 1. The simulation process. ' 



COMPUTER SIMULATION 



305 



tabjlished as study goals and procedures and how well the system to , 
>c studied is itself understood. Indeed, one study goal will often be 
to obtain a deeper understanding of the* system of interest— even 
here, though an attempt should be made at quantifying what is^ 
meant by the goal of achieving a deeper understanding before the 
simulation study gets '"under way. Initially, this may entail 
determining 1 whether certain narrowly defined input/output re- 
lationships exist. For instance, if the system] to be considered is a 
hospital outpatient clinic, one might be interested in how patient 
waiting time varies with the arrival rate of the patients. Once this 
has been. achieved, the problem may be reformulated and expanded 
to include investigation of various design alternatives. For instance, 
it might be preferable, from a mean-patient-waiting-time point of 
view, to schedule patients in different ways depending on their 
expected; resource requirements (i.e., some patients require long 
physician consultations, others short; some require x-rays, etc.). 

determining the schedule rs one aspect of system design and 
control. The final problem formulation will often relate to bptl- 
tnizing the system design. Optimization requires that a'measfrre of 
system performance be accepted by those individuals ^in volved in 
the simulation study. Once this measure of performance is decided 
upon, then system design alternatives can be examine^ in terms of 
maximizing or minimizing this performance measure. Preliminary 
decisions on what this measure should be are part of the problem 
formulation stage. This can be a nontrivial task, ^especially when 
individuals with conflicting personal goals and backgrounds are 
.connected with the study. Often a realistic performance measure 
can be formulated only later it! the simulation process when greater; 
understanding of system operation has been achieved. The out- 
patient clinic example illustrates how such performance measure 
differences may occur. The example used outpatient waiting tjme as 
-the measure of interest. Note, . however, that this might not be 
consistent with a dollar profit measure on clinic operation, or per- 
haps with a measure based on clinic resource utilization. 

Closely connected with establishing study goals is the question of 
the study plan. A clear plan for achieving the study goals will 
include time and dollar estimates for the various stages in the 
study. Stating, in written form, both the goals of the simulation 
study and the resources required for successful completion is es- 
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of a large organization. For instancfe, jn>it large corporate environ-: 
mcnt numerous individuals and"grpups may hayc'an in ■; 
development of a corporatp firi&n$ia^^ will 
often be needed to provide data, to aid. in model development, and J 
to provide financial support for the: ' 
of the study and a commitment to 'the general study. plan is there^ 

fore vital. '■ :■ ] r^' ~: c ..' ' f. 

The second stage in the simulation process relates to "system 
'definition. The entities or primary objects of .interest in. the system 
must first be identified. In ihe . hospital, dutp.atieh 
. pie, the entities mightbe the patients, doctors; nurses, and X-ray ; 
\upits in the clinic. Asspdated^itnah^ attributes' 
#whieh denote various^entity , prtopertids;^hus; associated with the 
^entity physicians might! th^sfatic^attribut^ ^quari^-'and^spe-'; 
cialty" and the dynamic^ 

associated with a particu^ at» 
point in time;r6pripsent the st«i?ij? J3f/ ffie entitj^lv^hiia the .collected ; 
states of all, critical Entities m-tha system^represent^he srafe of the 
system. Notice that the. ^n?mip attribute **busy/n0t b jiSy" indicates, 
that an 'activity is in process. SuGh^aqtivities and theirs 
. ships determine, . in par0tt>w ^he>system ^ate.,wili; ^bly€i in timje.; 
Static or clasp relationships betweetf entitie^ ajsp b6 specified £ 
(e.gl T associated, with each X-ray H uiiit^ nurses required to 

operate tjie unit!) when needed for sysfem definition ^ 
' In addition to, entities, attributesractiyitfesv apd states, iKe^Stem; 
boundarjes musftbe .d^hec};ff he s^em qf Concern is. said to exist 
in . an enVironfnc^.^The system bouridartes/determirie? tlib^e^titi^^ 
in the environment; (ie„ outside of the bpunUanes) ^fiich afleet* 
system activities but c6n whi£h thef system itself has^n^^ct.^^ 
turning to ttie oiit^atie^^ 
' represents^ an entity in ^he^ environment whiob, will a^ 
operation^by^alteniig the Arrival jrate of paftien'ts., , . y^i* ■. ' 
^nbthei^asp^ 

\ system of irrterest into Subsystems. Stictir subsystems sachjepresetit 
arv QAtity^dr- cfoljectid'n pf entities whiqh, in's^me sen^e, ' act^as; a 
unitin rei^(ioiiships with- oth^r subsystems The objective in defin- 
ing f subsystem&:is .^t6' simplify speqficatiQn .of system interactions 
and| £cthfties.F^ers^ generally nieanf that fswfci; 

attributev^^ needed in^describing th? system. 
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The question of what constitutes a subsystem is closely related to 
the development of hierarchies of system definitions. Such hier- 
archies usually represent differing levels of detail which may be 
used in defining subsystems. An example of this is given in Figure 
2; Say that the system of interest is a computer system. At the 
•'highest" level, a very low detail representation of a computer may 
view the computer as a simple server with given statistical execu- 
tion, properties which acts on incoming customer programs to 
produce certain program execution results as output. Given a sta- 
tistical description of the program arrival process, the length as- 
sociated with the queue of incoming programs, the discipline as- 
sociated with selecting jobs from the queue (e.g., first come, first 
served) and a statistical description of the service process, the goal 
of ;a simulation study might be to find out what the waiting time is 
for customer; programs (i.e., response time of the system). Such a 
low detail-level systq ^ ^ description could hot by itself be used to 
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; determine jlie ttflcct qfthc speed of^say,\the inasrstor«ge device ony 
■/ respfmsftiW ncejtejL 
;< t, Ldycl 2 tluis considers, the Computer syStcnfcas n$de up of ihtee 
/ t primary subsystpnis: central processor, central' memory, dnd mass 
' • storage. Notice thai othcV subsystems,; sucl) as the, system >rinte^ 
imight ^iiso bfr^dde^l at ;this level.*:The Question of what subsystems " 
% JshduldbS. included ^t each representation levfel is clearly a matjer v 

bf^judgmpnt. 1^ it w^s felt;ihat the systen? printer ■ represented.^ 
\. possible bottleneck orjlimiting resource in system opcratioii, v then if;; 

V would Have to^bc included.; Whi|e it wijj somdthnes bis necessary to. : 
incltid4a subsystem in the' 'hjodelv'descri^tion\in^Qrder to detefirrine 

r its relevance to.system Qpe^tion; the tendency to construct pveriy , 
' 'ebinpiM^odds^ouldi^e avoided. A(dded subsystfcn?s medn added 
. cpmplexity vand cost iii; programming,;' validating, verifying, apd 
generally understanding what's going dif in the model. Initiallyi it is 
>, usually bettor to err-on the side of v: simpticity rather than com- 

. V 'V plexity. . / . "Vi" . ' 7-V ' - ■ i /'i :v .\ 

V Pursuing this, example a bit^further, on£. can identify at least 
^hree ni^rc leve)s of ddtail. Thd ^entraKproces^ for 
• ins tarice;^ can be represeifit§4 i n v Verins of the registers, 1 ogical gates, 
and, information transfers between; them. : t his woutel be an appro- 
priate reprcsentattoijal level, jf the object ives W the simulation s 4 tudy 
•V;, concerned How. gate failures affected irtpepssor performance. Indeed 
}, '/; x such . s^iesfkre often- undertaken with a view toward designing 
f : . \faiflt dmectiori arid diagnostic capabilities for computer systems. 
. 'A lefef below, the register and /gate level would be the circuit and 
^ ^ one level below -this would view 

-i:'- the compVrtents iri.terms of the basic physical laws governing their 
- V. ■ operation.: 

v w ;the system definition stage discussed above is very much an 
analysis stage in that the basic components which make up the 
f\ modeVand the bounds on these components and the system are 

V ^ defined. The third stage, the model formulation stage, is a synthesis 

stage, in that the concern here is with the^overall structure and 
interrelationships between the model components. In-tfie model 
formulation stage, just ho$..the various subsystems and activities 
affect each other must be defined! Choices must be made, for in- 
stance, as to whether a deterministic or stochastic model should be 
used. Indeed, at this stage the general question of model type (e.g., 
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material, symbolic, etc.) must be resolved. Of course often the 
model type will be dictated by decisions mudc in the stuges ulreudy 
djscusscd. Given, say, a dynamic model, these dynamics must bo 
specified. Sometimes this is most easily done by means of logical 
flow diagrams, while sometimes it is convcnie^T to write down sets 
of equations which govern these interactions,' If a simulation 
language is available, it may be possible to conveniently describe 
these relationships directly in the language. 

This leads to the fourth stage, model implementation. Clearly, 
implementation problems and capabilities affect, and are affected 
by, decisions made in the previous stuges. In terms of digital com- 
puters, model implementation relutcs to how the modeHormulated 
is mapped into a correct sequence of computer instructions. 
Simply, how does<one produce a computer pro-am corresponding 
to a -given model? Two separate concepts should be identified here. 
The first relates to the descriptive question of how the model 
characteristics are represented in the computer's language. The 
second relates to the numerical techniques used to "solve" the 
model. This flatter question is treated in part in Chapter 7 on 
Numerical Analysis and is not dealt with here. References to some 
of the standard Avorks in this field are provided, and additional 
references will bi noted later. ' 



The descriptive problem has been eased considerably in recent 
years by the wide availability of a host of simulation languages 
which are often specialized to certain problem areas. For continu- 
ous systems (i.e., the variables in the system are continuous func- 
tions, usually of time) such languages as CSMJV Continuous 
System Modeling Program [7], [8], [9], and DYNAMO [10], [11] 
may be conveniently used. For discrete probabilistic systenjs (i.e., 
the variables in the system are stochastic in nature and change in 
discrete steps) such languages as GPSS, General Purpose Simula- 
tion System [12] [13], GASP II [14], SIMSCRIPT [15], and 
SIMULA [16], [17] may be used. The GASP IV [18] language is 
available for those systems which are best represented by a mix of 
continuous anjd discrete probabilistic variables. More specialized 
languages are also available f&r modeling very specific types of 
systems. For instance, ECAP, Electronic Circuit Analysis Program 
[19], is a language which allows for direct modeling of electrical 
circuits, while ICES STRUDL-II, Integrated Civil Engineering 
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Systems Structured Design Lunguagc [20] may be used to model 
clastic, staticully and dynamically loaded, framed structures. Scv-" 
cnil of these languages will bo examined in some detail in the next 
two sections of this chapter, It should be noted that while the 
numerical techniques needed to solve the system arc usually built 

• into the code produced by the simulation language compiler, some 
individuals prefer to code these numerical algorithms directly in a 
higher-level language such as PL/I or FORTRAN. This generally 
gives one more control of the- detailed implementation question's 
related to the numerical algorithms and often results in faster exe- 
cuting simulations. On thc^othcr hand, this approach typically re- 
quires more programming effort. In addition, while theflrcsulting' 
program will indeed be a fornV*of model description * the overall 
structure arid organization of the system will usually be obscured 
by the detailed level at which coding must proceed. By program-, 
ming in a simulation language the capping of the system model 
into the language commands is often more straightforward, with 
the program more obviously representing the system of interest. 
Model documentation and education problems arc therefore some- 
what cased. This will become clearer in the settions to follow. * 

Once the model has-been programmed on the computer, 
questions relating to its "correctness" and "goodness" must be dealt 
with. These questions have been separated into two parts. The first, 
model verification, considers how well the model responses, as now 
programmed on the computer, correspond to theoretically antici- 
pated results. The questions here concern the basic soundness of 
the model. Has the program been fully debugged? Is the random 
number generator working properly? Do simple parameter sensiti- 
vity tests -perform as expected? Perhaps certain model operating 
conditions correspond to a system which can be solved analytically 
(e.g., transform a nonlinear model into a linear one, transform a 
complex queueing itiodel into a simple single server queue). Under 

„ such conditions does the model output correspond to the analytic 
solution? The objective heresis thus to gaiiy confidence in the 
model's inherent operational characteristics. Often errors will be 
found at this stage which will cause one to reformulate the model. 
More often than not programming errors will be discovered which 
will result in at least partial model reimplementation. 

Model validation, the next part, is specifically concerned with 
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how well tho implemented model represents reality. Comparisons 
here arc comparisons of model outputs with real collected data. 
Clearly, this cannot always bo done. Often a simulation is used to 
investigate conditions or systems which do not yet exist, and there- 
fore for which no data is available, If these new situations represent 
reasonable extensions or modifications of existing systems, then it 
may make sense to structure the model so that both the existing 
system and new system arc modeled within the same framework. 
Validation of the submodel representing the existing system can 
then provide some added credibility for the new system model. 
When this can't be done, more effort should probably be allocated 
to model verification. References [21], [22], and [23] deal in part, 
with some of the statistical questions rclating to model verification, 
and validation. . 

Given the preceding stages the modeler is now in a position to 
'begin experimenting with hi$ model This stage consists of experi- 
mental delign, Execution, and analysis. Although the first stage, 
problem formulation and study planning, has dictated the type of 
experiments to be performed, the details must now be set. The large 
number of variables present in many simulations can lead to exces- 
sive computer running times and costs unless some care is taken in 
this experimental design. Optimization problemfirvfor example, 
which require multiple runs with parameter Variations can be 
costly even when care is taken [24]. This is c^>ccia1fy true with 
regard to stochastic models where multiple runs or extended runs 
may be needed to resolve questions of stochastic convergence and 
statistical validity. This latter problem is dealt with in part in 
Chapter 8 on Computer Science and Statistics. More information 
on experimental design can be found in [22], [23], and, [25]. Care- 
ful experimental design is also necessary if the experimenter Mto 
learn about and be able to analyze the system in an orderly, yuc- 
tured fashion. It is very easy to become overwhelmed with reams of 
simulation run outputs, each run having, for example, somewhat 
different parameter settings or initial conditions. 

When the analysis has proceeded to the point that the goals ini- 
tially established have been satisified, the final stage of the simula- 
tion process, documentation, is entered. Three types of documen- 
tation can be distinguished. The first concentrates on the results of 
the simulation study. That is: What has been learned about the 
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system of interest? What are the answers to the problems posed ig 
the first stag^. The second relates to documenting the simulation 
program itself so that it can be understood and, if necessary, modi- 
fied by others: The third concerns the use of the simulation by other 
parties who perhaps would like to use the simulation to run their 
own sqt of experiments. This latter document, in effect, is a user's 

manual, v : * '-*«fifc 

This completes the introductory section of this chaptgi^he re- 
maining sections concentrate, on the properties and use of several of 
the more popular simulation languages? 

■ ■ ■ / ^ • ■ - • '.. " v '■. ■ 

2. CONTINUOUS SYSTEMS SIMULATION , : 

2.1. Introduction. Continuous systems, are those in which , the 
state variables change in a smooth or continuous manner. Most 
continuous systems simulation is devoted to the solving of systems 
described by sets of differential equations. Simple differential equa- 
tion models, such as those with linear, constant coefficients, can be 
solved without the use of the numerical approximation techniques 
central to continuous systems simulation. . In these cases the] use of 
simulation techniques may still be desirable, however, because of 
the ease with which such problems caritbe represented and solved. 
Once the nohlinearities 'associated with most real world problems 
are brought into the differential equation model, it is usually im- 
possible to solve these models without using simulation techniques. 
It 'is in this area of complex, nonlinear systems that simulation 
methods are indispensable. y' ' ' - • " 

0 The remainder of this section on continudu^systems simulation 
contains three parts. The first presents two examples of continuous 
systems. The second presents a block-oriented simulation language, 
Block/CSMP, which is well suited for^use on a small computer? The 
third presents an /. equation-oriented simulation^ language^ 
360/CSMP, which is widely used on large IBM computers - These 
languages are each used to model and solve the, example, systems. 

2.1 Continuous Systems Examples. For the first example consider 
the simple mechanical system of Figure 3. The system consists of a 
mass M, syspemlg d from a rigid structure by a spring with stiffness 
K(X). This is in turn cpnnected to a rigid structure below through a 
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Fig, 3. Simple mechanical system. .- . - 

dashpot with damping constant D. The object of the simulation is 
to fin.d out how the mass oscillates with time if the mass is initially 
pulled down (i.e., the spring is extended) and then released. Such a 
simple model might represent part of the suspension system of an 
automobile With the spring' corresponding to a suspension spring 
and the dashpot corresponding to a shock absorber. The objective 
of the simulation might be to determine the spring/shock absorber 
combination which produces the "smoothest" ride. 

The differential equation which describes the system can be ob- 
tained directly by using Newton's Laws. The for?e needed to accel- 
erate the mass is 6 M(d 2 X/dt 2 ), where d 2 X/dt 2 represents the second 
derivative of the space variable X with respect to time. The force 
exerted by the spring is K(X): The notation K(X) indicates that this 
force is a function of the position X. For this problem the function 
is a nonlinear one which has been empirically obtained and is 
plotted in Figure 4. The force exerted by -the dasfipot is proportion- 
al to velocity andNte given by D(dx/dt). In this system, nq other 
forces act on the nu*s$: hence the nonlinear differential equation, 
governing the system is:^^ 



M 



dfx 

dt 



:dx 



2 +Dj+ K(X) = 0. 



. Note that two initial conditions must be provided in order id solve 
the system. "v. V . 
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Fig. 4. Spring force K(X) vs. distance X. 



It is sometimes convenient to rewrite the system equations in a 
somewhat different form. First solve for the highest derivative 
d 2 X/dtf. '■ y'.. 

, dt^=-Ivl( D dr + K(X) } V; 

Next let the first derivative of X with respect to t be equal to 
XI DOT and the second derivative be equal to X2DOT. The system 
described by (1) can now be described as: 

, V X(t) = JxiDOT(t) dt + X(0) (2) 

XlDOT(t) = J X2DOT(t) dt + X1DOT(0) (3) 

X2DOT(t)^ ~(1/M) (XlDOT(t) • D + K(X(t))) , : (4) 

Where jo indicates integration from time 0 to^ time t, and X(0) and 
XIDOT(O) represent the initial position and velocity conditions 
present on the massf. Notice that these equations are fairly obvious. 
Equation (2), for e^ample^merely says that X at time t is equal to 
the integral of the derivative of X from 0 to t, plus the initial 
condition on X. Thus rewriting the equations in this form empha-* 
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sizes the integrations which govern system operation rather than- 
the differentiations. The primary reason this hais traditionally been 
done is that .the integration operation; both when performed elec- 
trbnically.on an analog computer, and v when done numerically on a * 
digital computer, is a more stable procedure than the differen- 
tiation operation. Note that any system of differential* equations 
can be redescribed in the manner given above. 

• The second continuous systems example is taken from the field 
of population dynamics. Variations and extensions of this example 
have been used to study the interactions of different populations 
with each other and with the environment in which they exist [26]. 
To begin with consider a single population of size'N. On the one 
hand population growth will be directly related to population size 

\Le., the larger the, population, the morfc births and, deaths). On the 
other hand, as the population grows it, will exert an increasing 
impact on ^its; Environment. Given a finite or restricted environ- 
ment, at some.point demands on the 1 environment will, limit further 
population growth. Thus the larger the population /the lower the 
expected birth rate. These two ideas can be used to form a simple 
model of population growth. The resulting differentyi; equation, 
called the Verhulst-Pearl logistic equatibh, is given below 

/ r ' * dN = aN ^bN 2 .^ ^ (5) 

The first term, aN, represents the rate of increase of a population of 
size N if there were no resource limitations with "a" being the 

^intrinsic rate of natural increase (i.e., birth rate mjjms^death rate). 

:?*The second term, bN 2 , represents the inhibiting effe^^Ta limited 
environirient, and hence finite resoui^ps. The parameters a and b 

, are referred to as the "logistic" parameters. 

H A simple extension of this model considers two species, with 
populations N x and N 2 , competing with each other for the same 
"resources. Th? equations for this system would be: 

^^(ai-buNi-b^N^, 

•V; ' " ; " 

■ 'r" ; • "dt 1 = Nz(a2 ~ b2lNl ~ b " N2) - 
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In (6), a x , b^ x , and a 2 ^ b 22 ate the logistic parameters for each of 
the two populations when living alone. The.bfr and b 22 parameters 
indicate the effect of each population on the other. One interesting 
aspect of this model relates to relafiye population growth as pa- 
rameters of the 'equations are varied. It can be shown [26] that 
under certain conditions one population can grow while the other 
becomes extinct, while under other conditions an equilibrium situ- 
ation can occur with both populations remaining viable. Notice 
that both of these models represent a continuous approximation to 
a discrete system; That is, the values of N can only be integer, and 
can only change by integer amounts in tjje "real" system while,the 
model allows for fractional values and changes. F ^ 

To conclude this section let us rewrite equations (6) in thJ inte- 
gral form similar to equations (2)-(4). . V . , 



N x (t) - 



,N 2 (t) 



NxttXa, - b u N x (t) - b 12 N 2 (t)) dt + N^O), 



(7) 



N 2 (t)(a 2 - b 21 N x (t) - b 22 N 2 (t)) dt ; + N 2 (0). 



%Nx(0) and N 2 (0) represent the initial sizes of population 1 and 2. 



2.3.^ The Block Continuous System Modeling Program 
(BIock/CSMP). The BLOCK/CSMP simulation Janguage present 
ted here , is a successor of the 1130/CSMP language which was 
originally available on the IBM 1130 computer [8]. Versions of 
this simulation program have run on numerous minicomputers. 
The language is easy to use and interactive versions have been 
dev61oped. These interactive versions allow the user to ' specify, 
modify, and run the model while on-line to the computer. The 
immediate feedback of model solutions from the computer to the 
user not only speeds up the debugging process but allows the user 
to experiment with the model in a natural manner letting the user 
follow interesting leads and insights in a straightforward "hands 
on" fashic^ft 4 

Use of BB^)CK/CSMP can be divided into the four phases given 
below: . 

1. Configuration Phase-— The user defines a block diagram rep- 
resentation of the problem. 
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2. Parameter/Initial Condition Phase— The user specifies model 
parameter, initial condition, and function generator values. 

3; Timing/Output Phase— The user specifies numerical integra- 
tion, algorithm type and. step size, output variables to be plotted, 
and output time step size. 7 

4. Run Phase— The user runs the simulation, obtains the output, 
and determines what to do next. 

For phase 1, the configuration phase, the user must first specify 
his problem in terms of a block diagram. There are about 30 blocks 
-available to the user in. defining his model. The blocks can be 
broken down into five types: (1) Linear Continuous (e.g., Summer, 
Integrator); (2) Nonlinear Continuous, (e.g., Multiplier, Sine); (3) 
Nonlinear Discontinuous (e.g., Absolute V&ue, Limiter); (4) Con- 
trol and Timing (e.g., Relay, Unit Delay), and (5) Special Functions 
(e.g., Time, Jitter). A list of ten of these blocks is given in Table 2. 

To draw a block diagram for a system initially described by a set 
of differential equations, it is usually easiest to begin by transform- 
ing the equations into an integral form similar to that shown in 
equations (2) -(4) and (7). Corresponding to each integration oper- 
ator, an integrator block will be required. For the nonlinear spring 
example the output of eaph of the two integrators required will be 
X(t) and XlDOT(t), res^ctively./Now X(t) has XlDOT(t) as an 
input; hence the two integrators are connected as shown below: 



X2DOT(t) is the input to the integrator which is defined by equa- 
tion (3) and produces XlDOT(t). X2DOT(t) itself, however; is, equal 
to XlDOT(t) and X(t) multiplied by the appropriate parameters 
and summed together -as indicated in equation (4). The function 
generator and weighted summer blocks in the simulation language 
can be used for the purpose of implementing equation (4) with the 
Mock diagram for the entire problem given in Figure 5. This dia- 
gram clearly and visually indicates the interactions and feedback 
paths which are implicit in the system's differential equations. A 
similar diagram for the population dynamics problem is also given 



X2DOT(t) J 



0> 



XlDOT(tt 




X(t) 



in Figure 5. 
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Table 2 
BIock/CSMP Block types. 



Block Type 


Language 

Symbol 


Flow Diagram 

Symbol ' * 


Operation 


Weighted 
Summer ' 


W 






y . + P 2 X 2 + P 3 X 3 


Integrator 


I 




y p !+ /< V P 2 X 2 +P 3 X 3 } dt 


Inverter 








Gain 


G 


'■■-€F- 


.' . T 


Conetant! 


K 




' • : . 


Multiplier 


X 




y- x r x £ 


' Sine 

/ ' 


a 

S 




y - Sin(X 1 ) 


/ 

Function 
/ Generator 


F 




y - «<jy 

P 1 and P 2 define upper end 
lower bounds, eee text) 


timlter 


L 


H 1 P>^ 


If X l SP 1 ; y - P t 

if p^x^j y - ^ : 

UX^; y - P 2 



\ 



; Given that a block diagram has been established, a unique in- 
teger must be assigned to each blopk in the diagram. The block 
diagram may now be entered into the computer. In some systems 
this is done by means of punched cards. Figure 6 indicates how this 



COMPUTER SIMULATION 319 



* ..■.'» 




Population Dynamic • ProbUra 

Fig. 5. Continuous systems examples (initial conditions omitted). 



JnfprmationJs^ntered-in-an interactive system where the user has 
the option of nkving the system guide him in entering the infor- 
mation. User inputs follow the "?". 

The first set of user inputs consists of 1,W S 4 S 2. This says that the 
block assigned number 1 [s a weighted summer, W, whose inputs 
are the outputs of the blocks assigned numbers 4 and 2. The re- 
maining three configuration specification inputs complete the de- 
scription of the block diagram to the simulation program. Hitting 
the carriage return key on the computer input device in response to 
a ? alerts the program to enter the ,next phase. 

The parameters and initial conditions phase is also illustrated in 
Figure 6. The first input 3, -10 relate.s to k integrator block 3 and 
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V. j/;. \*. • . - CONFIGURATION SPECIFICATIONS, / V* i,,, ^ V v 

'• V o«TEFi:Tiock',-spECirifiATiOMs* < -• '■* >v,/ . . 'jTJ.V j 

■' SEPARATE ENTRIES WI TH COMMAS *: C ^ V" > ■' 

- s *■ CARRIAGE RETURN ENDS THE" "OPERATION \ * . \ ■' 
'<•• BLOCK- NUMBER^ BLOCK,. TYPE* -INPUTi C , 3lNPUT2» INPUTS J ~\ i 

•■■v ti,w#>*2 ■ ■, ■ • , / s :v . t 

■■ i ■ - ■ v • . - ;■ y , (ft-. >-> 

> i ■ :r- ?a#U! . -i' ■ i' .X: ; < , ■•■ " 

■ : : ■ *' "■■ • • V" * - "A ... _ ... . 7; * >.... ? '> 

■ • ' '-. -w# f/* "TV w,- v v .> • 

\?4/f«3 :\./;.~V"^. _ •; v . t .' -^tt • ■ 

■ V .K v ; . V: INI tjtAU CflNDI Tl 0N§ «AND 7>jiRANET'ER$ : 'i ' i" > 

*(' : ' ' *' ENTER t BLOCK N UMBER* I C"/P AR li:^PAR2# PARS 1 . \ ;\. - : -' \. ■ ' V 

/^.•'r ; t ' SEPARATE 'ENTRIES V^TH COMMAS ' s / \W 'f v " T. ' -*\A-'- 



CARRI AGE RETURN ENJJS VTH E l OPERAT; QN ! V.' ^ 



; ?\ - | 



r^CfI0N.GENERA^ORS' ; SP'ECinCATi0NS/ 
r ' -' . .; v . ■ *■ " r V ". ■" ^.-t' ' ; - v-'r - *~ <■ v ^ 

CSMP- l'l ALLOWS NLY , 3 FUNC;TLQN G ENERATO&S > 4 
* ENTER. fflE FOLLQVJ NGt ■ r->*'-- 3 g ;; -r./ - 

>>BLOCK NUMBER* 1 rUNCTION" INTERCEPTS - ^1 TO-* "> v ■■ - : / ^ 
^ FUNCTION INTERCEPTS. #5 TO 8 ■ ; ■" - 

\ " JimCTiON- INTERCEPTS «CC) T0 ;11 v V ^ , 
^SEPARATE - EN TR I ES WI TH, CQMMAS 1 • ■ ... . ■ : ^ , '» " .-" ' 
. .CARRtAGE RETURN ENDS THE OPERATION V'-V" ! :-^ v 



i*- . ' • ■ - ■ ? A> - 1 00 < * 64'/* - 3 6 ^ - 1 tfv , : ^ , ' ^ -J * 

r ■ ",x . v c - . - a-. * 0 .'0/ 2 • 0* i* • V, : ■ ■■■ ■ ■ ; ... - v ' v 

■ ^ r vC36.<64^00' - •". r ^ . ' \T 

>-.-■■ A >"s - . • ,. V' t .... :. v v : ■ - 



■.a ■; ; ■ 



- - - w •■ ; *" ; '• ' ■" *;ti m 1 Nd 1 n form at i 0 n < enter deci m al -po 1 nts > 

4 NTEGRAtl ON INTERVAL«*I J '* ""^ ^ ' 'f ^ ..-, o '' 

. > r'; # '." r 'total : TjiM^asT^,.. " ' V ^^i^" . "v --^ ' ^y:'< 

: Fig. 6. Block/CSMP input statements for the 

• ^specifies the imtiar^oncUtibns^o^ The second injiut 

defines ihe values by ^hich'thp t^q inp^ the weight^ Summer 
arp^to multiplied. The and>.; 
minimum values ^whiejti 4 are tp be provided ^^beLiunctl^^? 6 ^ 
atbr. .i?he function geite^ ppiftts. ^ 

-^The points vary fro|ii l \ 10 to :4r 1^, andjhe ^lues-as^iciated With 

,< each ppirit are ^veri in the function generator speciflcatioii.,Thbsr 
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points correspond to the curve of Figure 4. A linear interpolation 
'mejthbd is used so that for a particular X input the program pro- 
vides a K(X) value "output. 

•:'.V'-Tjmiiig information related to the integration step size and the 
total running time for the simulation are now entered. Other infor- 
mation, not shown in Figure 6, relating to the frequency of printing 
the output, the specific variables to be plotted qv recorded, and the 
integration method to be used must also be entered as part of 
s phase 3. The results of running the nonlinear spring model with the 
set of conditions indicated jn Figure 6 are shown in Figure 7. The 
left-hand columiv indicates the time, while the next column indi- 
cates the value of the ? output of block 3, X, at tfiat time. The 
remainder of the output is a plot of these values. The plot clearly 
shows the damped ' oscillatory Response jbf the system with the 
period of oscillation changing with time due to the presence of the 
nonlinear spring. Other plots with differing^pararileter values can 
be easily obtained: In addition the configuration statements can be 
saved on magnetic tape for future use. ; 

For/ completeness the input specifications for ' the population 
dynamics example are given in Figure 8. Instructional comments to 
the user have been omitted in this case. The output, though not 
provided here, indicates that, with the given pi&ameter values, a 
stable population mix is obtained and the equilibrium sizes of the 
two populations are about 52 "and 1240^ respectively. 

2.4. The 360/Continubus A ^System Modeling Program 
(360/CSMP). The 360/CSM P simulation language [9] is available 
' on 'many IBM computers. The language is a good deal more 
powerful than the Block/CSMP language just discussed. One can^ 
for instance, write Fortran statements in as part of the simulation, 
and provision is made for creation of user subroutines and special 
funqtipn^. The language itself provides a wide variety of simulation- 
oriented function^ (equivalent to the blocks in Block/CSMP) and * 
also contains' simple*, procedures for automatically controlling^ 
-■-"multiple model runs under varying conditions. Many of the bl6ck 
.types explicitly available in Block/CSMP are automatically ay^il- 
able' as .simple Fortran constructs. Note that some of these 
language differences are due to the fact that Block/CSMP is "inter- 
preted" while 360/CSMP must be compiled using the Fortran com- 
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^ RE5tX-T5 PRINT-OUT 

TIME OUTPUT? 3 -0.100005*02 ° 0 v *70000E*0l'" 

0.000 -0. $0000E*02 ♦ " J.; 

0.500 -0.78020E+01 I— — ft'' ' Iv 

U000 T0*32697E*0X I ->~ 1 

1.500 0* l0539Et0l. 1 I— — — — — — " —-——<-♦ J 

2.000 0.4477B£*0l~ I-' ♦ I 

2.500 0.633 l0E*j0l I-—-— -------- —————— — — — — — — — I 

3.000 V0*60605E ; * V 04 I—— --———,- ;——-——— ——————-——♦ I 

3.500 0*42375E*0l I— ------- ^ £ * J 

4.000 0* 18908E*01 I * , J 

4.500i -0. 19661E-00 X 1 ♦ J 

5.00fiT-0. 16965E*01 I * — ♦ J 

5.500 -0*3l522Et0l I — ♦ J 

6.000 -0.37349E+01 I . ^ J 

6.500 -0*35865E*0l I---------- ------ ♦ * 

7.000 -0*2866lE*0l I — — — - — — — — ■ p 

7.500 -0. 1 9509 E* 01 X-———--—---——- — * * 

6.000 -0»l0647E*0l I - J 

6.500 -0.29129E-00 I J 

9.000 0.35501E-00 I * ,v J 

9.5.00 0.86805E-00 I*-— ---------------- J 

10*000* 0.12486E+01 j ♦ - I 

18.500^ 0. 15042E+01 I ♦ I 

11.000 . 0.16453E+01 I — — ♦ ^ 1 

IV*500 0*i6866E*0t>- I———————— — -— — ♦ I 

-J2.000". 0* 16442E*01\ I — —————— ——————•»♦ I 

12.b00 0* 15353E+01 \f-~st.— — ——-—---—---— -------- J 

13.000 0.13766E+01 I---- ♦ . J 

13* 500 1 0. 1 1849E*01 I— -«r -V * J 

14*000/; 0*97435E-00 I ■ — ^ * 

U*500 0.75800E-00 I— «■— — - a*— — ^. I 

15*000. 0.54666E-00 I i---- ♦ „ X 

15.500 0.34899E-00 I- ♦ J 

16.000 0.17141E-00 I— — — ♦ J 

16* 500 0. 18306E-01 I— *»—-——-- — — — — — — ♦ #1 

17.000 -0.10785E-00 I— —————— —————— ♦ i ^* 

17*500 -0.20626E-00 X- *T ♦ 3 \ 

18.000 -0.27750E-00 I— j , ♦ 0 j 

10*500 -0.32330E-00 I 1— ♦ '•- - J 

19*000 -0.34619E-00 I - - — TiT -.V ~ " "v ' '\ ' 'J 

19*500 -0.34930E-00 I J * ^ v'- f 

20.000 -0*3 3608E- 00 I ZT^Vm^^^f ! t6\ a * J 

' 20.500 -0.31008E-00 \ 

21.000 -0.27480E- 00 I —————— -— ^^^Tf*r^^f ' S. " 1 

21*500 -0*e3352E-00' J It— — — — — * J 

'22*000 -0-18918E-00 I- — ---———--——-——-—--♦ J 

22*500 -0*l4434E-00 I—-— —————•————♦_. »• 

23*000 . *0* 101 L2E*00 I — — — ♦ I 

23*500 -0*61 176E-01 I-"" T * * - ! . 

,24*000 -0.257 15E-01 .X>— — *——*—-—-——♦ f . .... J 

.24.500 0.44753E-02 l'"' — T I 

25.000 0*88987E*il I ', mmt T r mmm * 1 

" J Fig. 7. BIock/CSMP output for nonlinear spring example. 

piler. See Chapter 2 on Programming Languages/and Systems for a 
discussion of interpreters and compilers. 

A model programmed in 360/CSMi* is conceived of as haying 
three distinct sections : Initial, Dynamic, and Terminal. The initial 
section is used for any computations which should be performed 
prior to solving the actual system of equations.-Valiies 6f< param- 
eters and initial conditions 0 may be computed in terms of more 
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CONFIGURATION SPECIFICATIONS 



1 U 

2 1 

3 X 
T4 U 

5 I 

6 X 

7 X 



2 
3 



3 
3 



2 



7 
0 
2 
6 
0 
3 
2 



3 
0 
0 
7 
0 
0 
0 



- INITIAL CONDITIONS AND PARAMETERS' 



1 0. 8000000 

2 300. 0000 

3 0.0000000 

4 O 6000000 
3 300.0000 

6 a ooooooo 

7 0.0000000 



-0. 2000000E-03 -0. 1060000E-01 
1. 000000 0. ooooooo 

o. ooooooo 

-a 400OOOOE-03 -a 2000000E-02 
1. 000000 0. ooooooo 
0. ooooooo 
0. ooooooo 



INTEGRATION INTERVAL- 0. 1 0000 TOTAL TIME- 14.000 

PRINT INTERVAL" 0.30000 . - 

OUTPUT BLOCK- 2. MIN. It MAX. » 0. 00000 300.00 



Fig. 8. Block CSMP summary input for the population dynamics, example. 



basic parameters and data needed for the model may be read in 
from a peripheral device at this time. Since preliminary calculations 
may not be necessary for some simulations, this section is optional. 
Figure 9 contains the 360/CSMP code for each of the two examples 
presented. Notice that in the initial section variable names are 
associated with the basic parameters (using the PARAMETER 
statement) and, for the nonlinear spring problem, the nonlinear 
function is defined (using the FUNCTION statement). 

The dynamic 'section of the program contains a complete de- 
scription of the system dynamics. In this section the block diagram 
description 6f°the problem is a mixture of 360/CSMP and Fortran 
statements. For instance, in the nonlinear problem the statement 
X2DOT= -(D*XlDOT + F)/M is an ordinary Fortran state- 
ment and corresponds directly to. equation (4). The other state- 
ments in the dynamic section use the 360/CSMP functions 
INTGRL and AFGEN. These correspond directlyto the integra- 
tionand function generator blocks in Block/CSMP. The large sej 
of such functions available to the .user is specified in [9]. These 
include, for instance, various logical functions which allow the user 
to alter the structure of the model dynamically. Note that unlike 
the Block/CSMP language there is no need to specify block num- 
bers. All interactions are defined by the use of common variable 
names. These names may be selected by the user for their mne- 
monic value. Furthermore, the equatioaiorm of the input is much 
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•••• •CONTINUtfUS SYSTEM M00ELING**»V • / ■ t 

•••PROBLEM INPUT STATEMENTS*** 

••• NONLINEAR SPRING PROBLEM ••• 
INITIAL ■ * . 

PARAMETER M-5. ,0-2. 

FUNCTION AA— 10. , - 1 oa w -9 i . -6« . .-6,. ,-36 

-*/■ -16., -2., -2., O.,0.j 2., 2., h. ,16.,... 
6 . ,36 . , 8. i64 . , 10 . . 100 . 
OYNAMIC . 
X200T»-(0«XiOOT+F)/M 
XlOQT«INTGRL<0. ,X200T> % 
" X-INTGRLC-lC.XiOOT) 1 

y F.AFGEN(AA.X) * 
0 'METHOO' AOAMS . . , 

TIMER DELTn.l,. FlNTIMa2^. p OtlTOEL«.5 
PRTPLT X 
. .. PRINT X,XiOOT.X200T,F 
e ^^END- ; 
:■ -• STOP 



'H 'r* ••••CONTINUOUS SYSTEM MOOELING PROGRAM**** 

•••PROBLEm/'SNPUT STATEMENTS* •* 

POPULATION' OYNAMICS PROBLEM ••• " 
INITIAL x 

PARAMETER Al-.O, Bll-,0106, 812-.0002, NllC»300. 
PARAMETER A2-.6, B22».0004 p B21-.0020, N2lC>300. 
OYNAMIC 

.NlOOT»Nl*(Al-Bll*Nl-Bl2 , N2) 
N200T-N2*(A2-B21*N1-822 , N2) ' 
: Nl»INTGRL(NlIC p NlOOJ) 
N2»INTGRL(N2lC,N200T) 
TERMINAL 

NlOIF»Nl-NlIC ■ . 
N20IF»N2-N2lC 

WRITE (6,100) N10IF.N20IF ■ 
100 FORMAT <• NlOIF«\Fe.3., ' N20 IF* 1 , Fe. 3) 
METHOO MILNE 

TIMER OELT+.i, FlNTJM-14., 0UT0EL»;5 
PRTPLT N1.N2 

■ ENO r~ ' 

v STOP ■ ■; ' : V 

Fig. 9. 360/CSMP pro-ams for two examples. 



closer in appearance to Uje fqrrrji of the differential equations 
making this code fairly easy to read and understand/ ••• 

The terminal section of the program contains computations 
whi<?h may be desired after completion of each run. For instance, in 
optimization problems, the terminal section might include the opti- 
mization algorithm. This section can, initiate rerunning the simula- 
tion with altered parameter values. The section is not always 
needed and has been omitted from the nonlinear spring problem! 
In the population dynamics problem it is used to calculate and 
print the difference between initial and final sizes of the two popu- 
lations. .* * " 

341 ■ ;:,v;'/' • ■ 
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A number of -other control statements 1 will be noticed in the 
programs. These have the generatttinctidWof controlling integra-. 
tion type (eight integration algorithms are available, or the user can £ 
supply his own) and step size, and variables* to be p'rinted and 
plotted. The specific functions can easily be inferred from the mne- 
monic statement names. r 

3. DISCRETE PROBABILISTIC SYSTEMS SIMULATION 

3.1. Introduction. Discrete probabilistic systems are thpse in 
which the state variables change at discrete' instants of time, and in 
which some of these changes occur in a stochastic fashion. Queue- 
inglsystems are an example of this. Typically such systems contaip 
the following elements: , . 

1. Customers or items which arrive into the system. Associated 
with arrival processes are statistical distributions which describe 
the probability of different v interarrival times an4 perhaps of 
various customer types. 

2. Resources which in some sense service the customers. Associ- 
ated with these resources are statistical distributions which describe 
the probability of different service times 1 

3. Queues which hold customers waiting to use a busy resource, 
or waiting for some general system state to occur. Associated with < 
these queues is a capacity which determines how many customers 
the queue can hold, and a queueing discipline which determines the 

' order in which customers are removed from the queue. 

4. System routing paths which determine how customers move 
through the system, of queues and resources. Associated s with these 
paths may be statistical distributions or logical conditions which 
determine what path a customer will follow. 

Note that many of the elements, described above may be func- 
tions of the system state.^For instance, the statistical distributions 
or routing logic may change as the number of customers in the 
system changes. *' '■ ■ ; » 

The general queueing system described pan be used to represent 
a wide variety, of situations. Customers arriving at a supermarket 
checkout counter, products being fabricate^ in a factory, airplanes 
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arriving and departing from an airport, paperwork flowing through 

a corporation, phone calls being processed at a central switching 

.facility, and patients arriving at an emergency room are a few 

" Sample situatipns which can be modeled as a general queueing 
system. Other situations, such as military armed combat and the 
stock market do not fit* the queueing model very well.. They can 
" often be modeled using other discrete event ^probabilistic sirtiula- 
> tion^methods. The lack of obvious structure in such systems, how- 
ever, \requiyes that a good dea| of attention be devotee! to their 
description. Therefore 0 , given the. limitations of a single chapter, the 

v example 0 considered later is of a simple queueing system; 

, Once the queueing- system, example has been presented, the fol- 
lowing two sections- will be devoted to implementing the queueing 

. model in two popular discrete event simulation languages. The first 
language, GPSS, .General Purpose Simulation System, v is probably 
the most 'widely used language of its type. It is easy ; to get simpler 
models implemented in the language and it as available on most 

° IBM computers. The version considerecl is called GPSS/360 (12). 
The secbhd language, SIMSCRIPT, is extremely flexible with" the 
simulation aspects of the langj^ge being embedded in a full-blown 
. general purpose higher-level language. The language is Harder to 
learn initially than GP$3;-howeveri that is perhaps to be expected 
as tfiq,prlce L of flexibility; \ V , 

A major difference betweein the languages relates to their general 
"world view." GPSSjs a transaction or particle-oriented language in 
that .{-he focus is oh thp entities (i.e., customers or items) which 
move'thfough the system. The simulation follows these entities as 
they move, from one activity .(facility) to another. SIMSCRIPT, on . 
the other hand, is an event-oriented languageip that the focus is bri 

; the activities . aftd on the events which defin^the starting and finish-, 
ing times of these activities. The simulation in this Case foilpws^be 
progress o if the various activities defijrfng the model. This differ$fice v 
will become cleareras the: languages themselves are discussed. Ref- 

T . ererices [27] and [283 discus$x>tner language differences. , 1 ; 

;X An important aspect of discrete- probabilistic simulation ccfncerns 

* ttie*generatibp of randbit) jiumbers frorn various distributions, the- . 
testing of such generators, and th& general questions of statistical 
convergence arid experimental design. TJhe numerical techniques 
-involved here correspond; in irriportance to the integration ^Igo- 
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; rfthm, step°size, and stability considerations which form the nu- 
mericaF.core of deterministic continuous system simulation. These 
questions are examined in some depth in references [22], [23], and 

3.2. Discrete Probabilistic System Example. The example con- 
sidered here and illustrated in Figure 10(a) is that of a batch- 
& oriented computer center. Customers arrive at the computer center 
V with! mean rate A (customers/minUt%^iy the interarrival times of 
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Fig. 10- Discrete probabilistic system example. 
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customers is exponentially distributed. That is: * 

F T (t)^ Pr(T ^t)=l-e^. t £ 0 (8) 

where Pr(T <> t) is the probability that the interarrival time T is less, 
than some time t. This is equivalent to saying that the arrival 
process is Poisson. Note that {his assumes that an infinite pool of 
customers is available. ' v" • . * 

4 Associate with each customer a class type so that all customers 
that arrive are said to be in either class A or class B. This repre- 
sents an attribute attached to each customer entity. Let XA percent 

9 of the customers be of class A. 

Each arriving customer is told how many jobs are ahead of him 

v in the queue. The customer, on -hearing this, now decides whether 

* or not freewill join the queue. If the queue length (i.e., number in the 
queue plus the number being served by the computer) is greater 
than sonje value, the .customer leaves, never to be heard from 
again: He : ttius engages in a form of queueing system behavior 
called /'balking.^ If the queue length is less than or equal to this 
value, the customer joins the queue and once in the queue must 
remain in the system until his job has been run by the. computer. 
That is, he may not engage in a form of queueing system fifehavipF 
called "reneging." . r . v v ^ 

The queue. kself is taken to (be infinite in length. Note, however, 
ttfat given the strict balking behayior described above, it need not 
be infinite. Indeed, this type of balking behavior can be modeled by 
assuming a finite queue with customers lost to :th$ system if they 
■arrive when the queue is filled to capacity. The queUeing discipline 
is taken as FIFO (First In First Out) with the- first customer in the 
queue proceeding to the computer whenever the computer becomes 
free;: '■ .. „' • . / ' v \ . ; . ■ 

v a The computer in this model can only handle a single ,custom'er 
job at a time. Orice'tfjat job enters the computer,^ rjuns to com- 
pletion without interruption and leaves the, computer .immediately 
on ending; The running time or . service time associated. With the 
jobs is takenlo be uniformly distributed. from T — 20 to T + 20. v 
The mean is-thqrefore TImihfftes/jbbj'andinl 

. dent on the class attribute of the job. Thus: : -'-~-'r\~' 4 - 



TV for. Glass A jotfs, 
for Class B jobs, 
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and the service time distribution is given by: ^ 

SO for r <T - 20,. 
(r-T + 20)/40 for T - 20 < r < T + 20, (9) 

Assume that the "system remains in operation for long periods of 
time so that start lip.^ariB shut down effects can be ignored. : 

Given this system description a number of items might be inves- 
tigated. From a customer service point of view, mean waiting time 
(response time) in the\sysfem is an important performance measure 
and might be examined as a function of mean arrival rate, com- 
puter service rates, class distribution, and balking point. From a 
computer-center manager's point of view, computer utilization (i.e., 
percentage of Jtime the fconjputer is busy) and job throughput (i.e., 
the rate at which jobs are processed) are important measures which 
are also functions of the yarious system parameters. Changes in the 
system involving the establishment of a priority discipline based on 
class type rather than the* FIFO discipline might also be invests 
gated with a view toward improving, for example, the mean-waiting 
-time; \ •' •/;••,, ■ - " 

In the sections to follow this system is represented and solved in 
the GPSS and SIMSCRIPTT languages. 

3 J. The General Purpose SiAulatioh^ GPS§r 
is a block-oriented language. The language contains over forty dif- 
ferent "block types wfiich perform ^ctiyitie^rcommonly found in 
discrete probabilistic models (e.g., generate air arrival process, place 
customer in a queue, etc.). These blocks often allow for easy trans- 
lation from a flow-chart representation of system into GPSS pro- 
gram. Thus a clear correspondence is seen from Figure 10(a), which 
represents the example system, and Figure 10(b), which is the 
GPSS block diagram. Short definitions of a few of the more im- 
portant blocks are given in Figure 11, and a complete definition of 
all blocks can be found in [12] and [13]. 

Figure 12 presents a GPSS program which corresponds to the 
blocks of Figure 10 and solves the example problem. The program 
begins with a number of comment cards. These are designated by 
an asterisk in column 1. The first noncomment card is the GPSS 
control card SIMULATE. This card indicates that an actual simu- 
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Block Type " Block Symbol General Function ■ 


GENERATE . 


.-•'I 


A,B» ' . l 


Create transactions with mean Interarrlval 
time given by A, and distribution specified 

b. '■■ . 


TERMINATE 


o 


Removes transactions from the system. 
Transactions entering the block are elimi- 
nated. A la subtracted from termination 
count value (specified by START coaoand) . 


SEIZE 

f 




k 


If facility A Is not In use, a transaction 
entering the SEIZE block will cause it to , 
become "busy". Any other transaction now ' 
attempting to enter this facility Is halted. 


RELEASE ' 






Hegatea the effVct.of the SEIZE block 
causing faclllty\A to become "not busy". 














QUEUE/ : V- 








Arriving transaction Is placed in a queue. 
A -of unlimited. alze. 




"■ . . 


& 








V^DEPART 








A transaction la removed from queue A and 
sent out of the block. A FIFO discipline 
Is used. . "* 














Advance ■ 




. . \ 


A transaction entering the block Is delayed 
for a period of time before departing. 
The mean delay time is A, and distribution 
is specified by B. 

. ■■ ■ - ■>;'■ 


Transfer 




A' is a number from 0 to 1. An "entering 
transaction will proceed to the block 
labelled C with probability A, and to 
block labelled B with probability 1-A. v.* 



Fig. 11. Short definitions of some GPSS block types. 



lation run is to be made. If this card is omitted, theh the GPSS/360 
compiler will only span the input for coding errors. 

Following this is a function definition card. EXDIS is a label 
which the user has associated , with the function to be generated. 
. RNl indicates that the output of random number generator 1 (one 
of eight available in GPSS/360) is to be used as the independent 
variable in the function evaluation. That is, every time this function 
is used, a uniformly distributed^rahdom number from 0 to 1 is 
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Ml ,0.20,40 , • 




4 00 




END 





BLOCK 

NUMBER *LOC OPERATION A,B , C, 0 ,E , F ,G COMMENTS, 

• h ' 

• SIMULATION OF A BATCH COMPUTER SYSTEM 

• . • >' ' 

SIMULATE 

• . . 

EXOIS FUNCTION ■ RNi,C24 NEG. EXPON . .DISTRIBUTION 

0,0/.!,. 10*/. 2 , , 222/. 3, • 355/. 4 , .509/. 5, . 69/. 6. . 9 1 5/. 7 , I . 2/ . 75 , I . 38 
. B, I. 6/. B4, I. B3/. 88,2. I 2/. 9. 2. 3/. 92. 2. 52/. 94, 2. 8 I/. 98. 2. 99/. 96, 3. 2 
1 . 9 7, 3. 5/. 98, 3. 9/. 99 ,4. 6/. 995, 5. 3/. 998, 6 • 2/. 999, 7/. 9998, 8 

INTERAftRIVALS EXP.OIST; 
IF QUQUE LENGTH LT 3 ENTER QU 
PLACE JOB IN QUEUE 
USE THE COMPUTER 

BO X CLASS A, 20 X CLASS B 
CLASS* A JOBS, MEAN 40 SECS. 

CLASS B JOBS, MEAN 60 SECS. 
LEAVE COMPUTER SYSTEM 
TABULATE TRANSIT TIMES 

RUN 400 TERMINATIONS 

Fig. 12. GPSS/360 program for computer system example. 

" : •" ' "' .' ' . ' . ' • ;. " \ ■ \ ' " . * 

generated for use as the independent variable. Nbte that this inde- 
pendent variable can be any one of a number of "Standard Nu- 
merical Attributes" which are available in GPSS (e.g., queue length, 
facility, utilization, clock time). The second operand C24 indicates 
that the function is continuous, and 24 value pairs are to follow 
which define the function. This particular function is the negative 
exponential^ tribution as indicated in the comment ^ the right 
side of the statement. The 24 value pairsjollpw on the next, three 
lines. GPSS performs a linear interpolation between the points 
when generating a value for a particular RN1, This particular func- 
tion, as defined, allows one to generate random variates from the 
exponential distribution. This represents an implementation of the 
inverse transform method for generating random variates [22], 
[28]: . 

The function is used in the GENERATE block statement which 
follows. The GENERATE block creates the transactions which 
move through the system. In this case these transactions corre- 
spond to computer jobs. The first operand indicates that the mean 
time between creating a transaction is to be-60 time units (seconds 
in this case). The second operand indicates that the Junction 
EXDIS is to be used to determine the distribution associated with 
this arrival process. This situation therefore results in the gener- 
ation of transactions with exponential interarrival times. Other 
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capabilities of the GENERATE block include associating attribute 
values with each generated transaction. 

Transactions created by the GENERATE block next enter the 
TEST block. This block controls the flow of transactions on the 
basis of an algebraic comparison pf two attributes. In this example 
the L in Test L specifies the Less Than condition. Equal (E), Not 
Equal (NE), Greater Than (G), Greater Than or Equal to (GE) and 
Less Than or Equal to (LE) are also possible. Operand' A 
(QSLQUE) is compared with operand B (3) and, if the comparison 
^ condition specified does not hold, then the entering transaction is 
routed to the block specified in operand C (OUT). If the condition 
does hold, then the entering transaction is passed through to the 
next sequential block (QUEUE). In this example operand A speci- 
fies the standard numerical attribute ? queue length for the queue 
labeled LQUE. Operand B is the constant 3./Thus if the qtfeue 
length is less than 3, the transaction passes to the next block; 
otherwise it leaves the system. This, in effect, implements the 
balking-customer characteristic discussed previously. 

The QUEUE block which follows acts as a storage or waiting 
line for transactions. Such a storage area is necessary if a facility, or 
piece of equipment, is in use. Operand' A (LQUE) specifies ? a sym- 
bolic name for this queue! This name can then be referenced as in 
the TEST block to obtain various queue parameters. Thus the 
generated transactions now enter the queue if its length is' less than 
three. 

In order to simulate situations in which equipment is used, or in 
which there exists a single server, GPSS provides a facility entity. 
Such an entity serves, a single transaction at a tim& and when 
providing such service is busy (i.e., 1 occupied). Any transactions 
which may desire to enter at this time are blocked. One way a 
facility is defined -is by the- presence of a,symboHc name or number 
in a SEIZE block. COMPU is a symbolic name defining^ facility 

V;< which in this ca^ represents the computer. A transaction, oh en- 
tering a SEIZE block^ causes the facility (Ct)MI$J) fb" become 
busy and prohibits other transactions from ejiterifig^ntil the facili- 
• ty is RELEASEd. The RELEASE block moves the transaction^ 
; the defined facility to the ne^Tblock and causes the facility to now 
become not busy. Thus the transaction at the head of the queue 
LQUE pontinually^attempts to ent^r th^EIZE.block and even- 

■5; rtualiy succeeds w|ien the COMPU facility is RELEXSEd. ■ • * - >y 
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Once the transaction has SEIZEd the facility, it must relinquish 
its position in the queue. This is done with the DEPART block. 
Operand A (LQUE) of this block specifics the queue frcjm which 
the transaction is departing. This causes the number in the queue 
to decrease by 1 and may therefore affect the direction of subse- 
quent transactions which flow througlHhc TEST L s block. 

In ordci* to simulate the presence of two classes of jobs, the 
TRANSFER control block is used. Operand A of this block is a; 
numbejr from 0 to 1. An entering transaction will, proceed to the 
block named in operand C (ABLK) with probability A (.8), and to 
the block named in operand B (3BLK) with probability l-A v That 
is, operand A indicates* the proportion of entering transactions 
which go to the block named in operand C. In the example 80 
percent of the jobj are taken to be in class A and 20 percent in 
class B. 

The TRANSFER block moves transactions to one of two AD- 
VANCE blocks. The ADVANCE is used to simulate the passage of 
time, in this case the execution time of a job. The transaction 
entering the block is delayed an amount of time specified by oper- 
ands A and B. The delay time here is a random variate whose value 
is generated from a uniform distribution with mean A, and mini- 
muip and maximum values A - B and A 4- 6, respectively. Class A 
and class B jobs thus represent jobs w^th different mean execution^ 
times, 40 time units for class A and 60 time units for class B. 

When the transaction leaves the ABLK ADVANCE block, it 
enters an unconditional TRANSFER block which moves it to the 
CEND RELEASE block. This is the same block -which is entered 
by; the transactions leaving the BBLK ADVANCE block. The 
RELEASE block releases the computer facility COMPU. At this 
point if there is a transaction waiting in queue LQUE, it can 
SEIZE the computer. 

Af^£ the RELEASE block, a TABULATE block is entered. This 
is used to gather certain statistics which are not automatically 
collected by GPSS. Operand A (TRNTI) points to a TABLE defini- 
tion statement which must be examined to understand the TABU- 
LATE block. The TABLE statement has four operands. The first 
(Ml) is a.code which indicates that GPSS should gather statistics 
,pn the transit time of transactions entering the TABULATE block. 
j£j?e transit time is .the time from transaction creation to entry into 
the TABULATE block. In this case this represents the time that a 
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job stays! in the computer systcjn, TKc; remaining .three operands (0, 
20, 10) indicate the lower' limit", interval size, and nuriiber of inter- 
vals to be used in the statistics gathering. 

The final block, the TERMINATE block, removes transactions 
from the system, and thus simulates the departure of jobs. The 
operand value is used in conjuction with a "termination count," 
The termination count is specified in the operand of the START 
control card which follows. This count is initialized to the START 
operand value (400) and is decremented by the TERMINATE op- 
erand value (1) every time a transaction enters the TERMINATE 
block. When the count'is equal to' zero, the simulation run is ended. 

In addition to specifying the termination.courit, the START con- 
trol card indicates to GPSS that the set ofinput cards necessary to 
execute a simulation has been received and that the execution 
phase should proceed. The END control card is that Mast card'of 
the GPSS input deck, Note that, although this example has only a 
single execution associated with it, multiple simulation runs can bV 
accommodated with altered parameter values or model structure, 

The next item to consider is the output produced by the GPSS 
program discussed above. Although the output produced runs over 
several pages, it has^beeri 'condensed and presented in Figure 13, 
Some of this output represents statistics automatically gathered by 
GPSS, while some of it (TABLE TRNTI) represents statistics re- 
quested by the program through use of the TABULATE block, 
Not all of these statistics will be discussed here. The first main set 
of statistics represents a count of transactions which have gone 
through each block. The block numbers are^roduced by the GPSS 
compiler and correspond to the numbers on the left side of Figure 
13. Notice that blocks 7 and 9 represent the different number of 
class A (301) and class B (75) jobs which went through the system, 
Also, the difference in totals represented by blocks 2 and 3 (403, 
379) indicate the number of jobs which turned away~from the 
system because the queue length was greater than two. Thus about 
6 percent of the potential customers were lost. The remaining sta- 
tistics are fairly clear. The computer (Facility COMPU) was busy 
68 percent of the time and the average time each transaction used 
the facility ^was 44.75vtime units. The queue has an average length 
of .556 and the average transaction spent 36.329 time units in the 
queue. The table TRNTI indicates the distribution of overalf wait- 
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ing times of jobs which enter the computer system,; with the mean 
(80.845) and standard deviation (42 .375) explicitly noted. 

This concludes the discussion of GPSS, The section to follow 
considers the SIMSCRIPT language which views the world of dis- 
crete probabilistic simulation from another perspective. 

' ,■ *»v ■ 

3.4. The SIMSCRIPT Simulation System. SIMSCRIPT (15) is a 
general-purpose, higher-level language which has most of the capa- 
bilities of FORTRAN jn addition to providing list processing and 
discrete' simulation facilities. The version considered here, SIM- 
CRIPT IL5;is documented in [30]. i 

Before considering^ SIMSCRIPT implementation of the cxamr 
pic problem, a numbcr^of concepts central to tnje. SIMSCRIPT 
word view n^ust be explained. Some of these are^simijar tp ideas , 
presented in Section" 1.3. SIMSCRIPT considers Systems to be. ; 
composed of erUities which have associated wi(lThfhem attributes, 
Entities may- be dcfincd4rf a number of ^ays. Consider, for vin- 
stancQ, th^cntity called JOB, which has attributes, TMtEAN and 
ARR\TI (arrival time). SIM S(£ Rl PT has provision for defining thfc r 
general entity JOB as follows : * : 1 t 

^ EV^ERYJOB HAS, A 1 1^ E AN ANp AN ARRTI 1.*.: 

EVERy^te a reserved word and has" a specified staternent^ format 
associated wjth ft. ■* ' , 

Two .types of entities are permitted in SIMSCRIRT.^ Tfirnporary 
entities 1 are used for those entities whic)i^j5 .created 'or destroyed 
^during the course of the simulatibn.^jOfis may be thought of as 
e n tering* (C RE ATE a' JOB) and leaving (DESTROY, a JOB) the 
system. ^By^ssociiating them with temporary entities, storage can be 
dynajriically allocated to thesVJOBs. Assigning JOB as a'feiftpor--: 1 
vary entity. With 'the TME^ and ARR.Tt attributes is done with 
the following statements: > v : -.^v/v -r /"i?"JL-.'.v**- 

; ' ' , -J : TEMPORARY ENTITIES . 

. .... v"* EVERY JOB HAS. A TMEAN ANE> AN A RR-TI >* w.'ffi 

• The other entity, typp, permanent eptiifc may be used^to define those 
entities which/will not be individually created dr destroyed ^u.ring, 
;the simulati6n. Su^h entities are. Stored ""by SIMSCRIPT collectively 
and/ate used to represent relatively sfatic objects^, which., fem^in 1 
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present in the system for the duration of the simulation. For 
example: . ' 

PERMANENT ENTITIES 
EVERY SCHOOL^HAS AN ADDRESS 

It is Usually important to be able to collect entities into groups 
or sets and to provide a means for entering and removing entities 
from such sets. SIMSCRIPT allows the user to define sets with the 
DEFINE SET statement, and to enter and remove entities from 
sets with the FILE and REMOVE statement. For instance, defin- 
ing the symbol JOB.Q as a set which is in fact a FIFO queue can 
be done with the statement : 

DEFINE JOB.Q AS FIFO SET 

In order to link entities with sets, provisions arc available for in- 
dicating that an entity belongs to a set. If a JOB is to be able to 
join the JOB.Q, then the following expanded definition of the JOB 
entity is needed. . 

TEMPORARY ENTITIES 
EVERY JOB HAS A TMEAN, AN ARR.TI 
AND MAY BELONG TO A JOB.Q 

Sets themselves must be associated with entities in the sense that 

the set is OWNed by some entity. For instance: 

■ • *■» • 

,EVERY v SCHO0L HAS AN ADDRESS, OWNS SOME 
J^TUDENTS, AND BELONGS TO A SCH.DISTRICT 

Itp^case the entity SCHOOL has an attribute- ADDRESS, In 
siMm^ the set STUDENts is owned by SCHOOL, and the 
^qHtOOL itself is a member of the set SCH.DISTRICT. Clearly, 
complicated set relationships can be established. These relation- 
ships, however, provide for great flexibility in modeling complex 
situations. ' - • ■ / v ' - ■ ' ■••C '.- 

One final point in this discussion has to do with SYSTEM at- 
tributes and set. ownership. The term. SYSTEM refers to the overall 
system being considered. This SYSTEM can itself have attributes 
and own sets.- By Jhavirig attributes, these attribiites^iiow become 
global, variables (i.e., these attributes -can be referenced with the 
same name in different subprograms or subroutines). In addition, 
certain global pointers are now available for getting at elements of 
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system-owned sets. The set JOB.Q will be owned by the system 
when defined by the following statements: 

THE SYSTEM OWNS A JOB.Q 
DEFINE JOB.Q AS A FIFO SET 

The logical relationships as defined above arc static in paturc. To 
model the dynamics- of the system EVtNT ROUTINES and a 
process for SCHEDULING such routines must be available. 
Events correspond to transition points between operations or ac- 
tivities in the system modeled. They represent a change of state. 
Part of the art' involved in such simulations is determining the key 
events which take place in a system. For each of these key events a 
program or event routine must be written. EvcnUroutines have twq^ 
general functions. First, they perform whatever logical operations 
-or calculations are associated with the event. Second, they deter- 
mine what future events arc to take place given-thc current system- 
stated and then they schedule these future events. 

Scheduling an event effectively means thai an event routine name 
and associated future time of occurrence are placed in a list or 
stack. Events in this list are ordered by tim6 of occurence, with the 
first event, being one whose time of occurrence is closest to the 
current simulation time. Note that as simulation progresses a clock 
must be maintained by SlMSCRIPT of the current simulation time, 
(variable name TIME.V). Thus, as one goes^dwri the event list, the 
cvQnts listed arc" scheduled to occur further and further in the 
future. . / , . ' ' 

SlMSCRIPT <and all other. discrete event simulation languages 
provide routines for maintaining the event. list, determining which 
event routine is to be executed next, and transferring control to 
that routine, A RETURN from an evenf^routine returns control to 
the "scheduler" which in turn passes^cbntrdl to the routine for the 
next event to occ^ir. In this way by successively having event rou- 
tines schedule future events, and by paving a "scheduler" routine 
pass control to the succeeding event routines, the dynamics of the 
system arc modeled. ; 

The events themselves are defined initially by using, the EVENT 
* NOTICE statement. For instance: 

EVENT NOTICES INCLUDE ISSUE 
EVERY END.SERVICE HAS A JNAME* 
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Two events arc defined by * the statement above, ISSUE and 
END.SERVICE, Associated with ENb.SERVICE is an attribute 
JNAME which, in the example to be discussed, is used to associate 
a particular JOB with the END.SERVICE^cvcnt. 

Notice that in the discussion above the event routines all oc- 
curred due to event scheduling internal to the SIMJJCRIPT pro- 
gram (i.e., scheduled by event routines), Such cvbht routines arc , 
said to be endogenous routines. It is also pdssibIc4o schedule events 
external to the system by reading in a list of fcvents from data cards 
or other input, Such events arc referred to as external or exogenous 
events. 

Consider next the computer example as programmed in SIM- 
SCRIPT II. 5 and presented in Figure 14. The program is divided 
into four sections. Tho first is the. PREAMBLE in which the 
various entities, attributes, global variables, and set relationships 



9 

10 ' 

11 . 

It-'"' ' 
13 END 



;e\ 



PREAMBLE 

NORMALLY, MODE IS INTEGER 
<■ THE SYSTEM OWNS A JOD.Q 1 

DEFINE JOD.Q AS FIFO SET * - 

TEMPORARY ENTITIES , 

EVERY JOB HA&~*"TMEAN,A ARR.TI AND MAY BELONG TO A JOB.O 
EVENT NOTICES INCLUDE ISSUE 

EVERY ENO. SERVICE HAS A JNAME 
DEFINE TOT. JOBS. DONE. UTIL ANO JNAME AS VARIABLES 
DEFINE ARR.TI, SYS.'TI ANO TMEAN AS REAL VARIABLES' 
TALLY W AS THE AVG OF SYS.TI 
ACCUMULATE UZ AS THE AVG OF UTIL 



'ENO PREAMBLE 



1 MAIN 

2 SCHEDULE AN ISSUE NOW 

3 LET UTIL-O LET TOT. JOBS. OONE-0 

4 START SIMULATION 

5 END • , 



'MAIN ROUTINE 
•INITIALIZATION 
'START SIMULATION 
'ENO MAIN ROUTINE . 



1 EVENT ISSUE v ".ARRIVAL OF JOBS 

2 . SCHEDULE AN ISSUE AT T I ME .V ♦EXPONENT I AL. F ( 60. * 1 ) 1 'SET NEXT JOB ARRIVAL 

3 IF (N.JOB.Q GT 2) RETURN ' "JOB BALKED 

4 ELSE CREATE JJOB "*LET ARR.TI-TIME\V . "ASSIGN ATTRIBUTES 

5 IF- CjUNIFORM.Fco. •'!«.!) LT .8) LET TMEAN"* 0. GO TO T.BUSY 
* ELSE, LET TMEAN-60. 

7 'T.BUSY' IF(UTIL - DFILE JOB IN JOB.O RETURN "COMPUTER BUSY? 

I ELSE LET UTIL - 1 "MAKE COMPUTER BUSY 

9 SCHEDULE AN ANO.SE RVICE (JOB ) AT TIME . V*UN IFORM. F( TMEAN-20 ., TMEAN »20 
10 RETURN ENO * ■' - » * VEND ISSUE EVENT 



• 1 ) 



1 EVENT END. SERVICE(JOB) 

2 LET SYS.TI-TIME.V - ARR.TI 

3 ■* 'LET TOT. JOBS. DONE-TOT. JOBS. DONE ♦ I 

4 - DESTROY JOB 

5 - 'IF (TOT. JOBS. DONE GE 400) PMNT 1 LINE WITH 
MEAN WAIT TIME- UTILIZATION- 



"ENO OF COMPUTER USE 

"CAL.WAIT TIME 

"CAL. NUMBER JOBS DONE 

"DESTROY JOB 
W AND UZ AS FOLLOWS 
• • * • • * 



6 

. 7'-* 
8 * 
9 
10 



STOP 



ELSE IF JOB.Q IS EMPTY LET UTIL-0 RETURN 

. PLSE REMOVE FIRST JOB FROM JOB',0 . 
.SCHEDULE AN END.SE RVJCE ( JOB > AT TtMV. V*UN IFORM .F(TMEAN-20 . ,TH£AN*20 ., 1) 
RETURN END "ENO ENd. SE RVICE EVENT 



• 'STOP SIMULATION 
"QUEUE EMPTY? YES, RETURN 
•NO. REMOVE JOB 



Fig. 14. SJMSGRIPT \l T S programmer computer example. 
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are defined/ Most *6f the statement types have been considered 
earlier in this section. Another function of the PREAMBLE is to 
^ request that certain statistics be gathered automatically by SIM- 
SCRIPJ. In this example the TALLY statement is used to associ- 
ate the variable name W with the average value of SYS.TI. SYS.TI 
V v : is a variable whose value is th$, total waiting time for a job. This W 
*■ is calculated as a simple average of SYS.TI over all jobs which pass 
through the system. The- : ACCUMl)LATE statement associates the , 
variable name^tJZ with the average utilization of the computer.' 
^ Every time UTIL, is s.et to 1 (computer busy) or 0 (computer not 
busy) the appropriate statistics are gathered and a time average of 
the UTIL busy parameter is maintained. ', 
" V * The second section is the MAIN 1 program. This section schedules 
. the first everit to occur (ISSUE) at time NOW which is the curftnt, 
Slock time : of the simulation. It then initializes several parameters 

* and then transfers control to the.event scheduling routine with ttte 
command START SIMULATION. . 

^Since an ISSUE event has been scheduled/control will pass to 
. : the EVENT ISSUE routine? EVENT ISSUE handles the event of 
i - job arrivals. It first schedules itself ^i.e.y another job arrival) for a 
tirne in thtf future which is exponentially distributed. It then deter- 
^, mines . if tlie queue (JOB:Q) has too many jobs in it and, if so, 
'i'l^iurns control, to the scheduler. If not, a JOB is CREATED and 
it s| attributes assigned; At this poinf a test is made to see if the 
.v.. computer is busy. If it .is busy (UTIL = 1) the JOB is FILEd in the 
JPB^Q/ If it i? not busy, it is made busy, thus modeling the action 
, orthV JO on the computer. The end of this execution 

'-;tjme, indicated r by the EtyD.SERVICE event, is determined and the 
^ • END.SERVIGE e ven scheduled, the scheduling is done in accbrd- 
^Jahce with the class type : for the JOB. .•.«'". 
i* ■■{■■■'v.tte-iinal' routine is the END.SERVrfCE event -routine, The rou- 
^ flnie first calculates the wait time (SYS/TI) for the JOB that has just, 
^ finished execution and the number of Jpbs that have gone through 
, . r the computer (tOT.JOfiS.DpNE). The job which has just finished 

• • service is DESTROYED at tbfts point, releasing the storage space 

associated^itfi it. On.4he b^is'of TOT.JOBS.DONE the simula- 

: tion ma^ the gathered statisti^ printed: If 

: ; v^lhe ^(t^^M^is not terminated, andj^here is a t |Q : B available in 

5 : " the JO^^p^v,|lii's JOB begins tc^ecute on the computer and 

• ::-- ;,; .>;V:;.«^^\>-". .. : v yj /:v - vV '' ' " 
- Q rr •••.>• ■ ' • 
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an END.SEItVICE event for the JOB is scheduled. Otherwise con- 
tcoj is returned to the scheduler.. In: this example 400 jobs pass 
tHrough the system before the simulation is terminated ^On terming 
ation the mean wait time and' the computer utilization is printed,- > 

4. SUMMARY • * ^1:0$ 

This chapter has considered the general topic of simulation on 
digital computers. The first section emphasized the^broad questions 
of modeling and simulation methodology. The second section con- 
sidered two language systems used for modeling of continuous sys- • 
terns, while the third section considered two language systems used ■■ 
predominantly ^for modeling discrete probabilistic /systems. The em- 
phasis in these later two sections has; been on preser&i^ some 
typical, but simple, systems and demonstrating the language system 
capabilities by modeling these system^ The reader is directed to the 
references for details on numerical questions related to model solu- 
tion arid fora wider view of these and other simulation systems. 

; ..• : ■ - . ■■ •;• . , • h .'V..-' 
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Jf> ^stCi3sihg, the iipf)£ct : of high-speed , ^ 

data an aly sis is* difficult! The issues in vol ved sp& ri :.virt yaHy^ycifyV" : 
'thing st&tis^icip v hs-^a-9d Kow they;dcr it. In part,. genial statistical 

methodblojgy' is motivated, by purj^es^ our'^ A 
r^cfsion makirtgVnd our ability to derive" new, kno^dgo^om, ex- 

pefien^es, Since ^experieiices' are *g&nerat£ci in a variety^ settings 



settings 

^nd sre based bti a^humber ;of;di|Ter^)nt sutetr^te^Cji.^s been 
desirable to develop, tools i.^fc^^^de^ing^^itti ^diversity - a^cl hetepor 
gejieity v ;In*a" typical e^riment, a number of substrates are ex- 
posed.; tp^a ; stimiuliis . ahdr butcdiries artf' obs&:yed. The. Outcomes 
reflect the s tijnujiis, ^ the difl^rehce^ among substrates, of- both* One^ 
task bf &atist[cs is- Vo separate 1 that ,parfc of the outcome due to the 
stimuluis from that due tp^-subistr^ 

irivofv§s st.udyirig the ' in Veraction ^ between many siiljstrates and the 

, ,Y Such studies^ traditibiial producers^pf sizable amounts of data, 
steadily have been increasiftg 'tjieir size and scope in response r to 
- technplbgical advances that have^facilitated data acquisition; The ' 
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process is accelerated even further by the availability of autpmated 
computational procedures, so that studies currently considered 
routine* were well v beyond contemplation two decades ago. How 2 " 
,ever, the desire to .exploit this growth in an ^(Tort to improve our 
observational acuity brings with it a serious risk o^overstressjng 
the discriminatory facilities Inherent in hfatty statistical techniques^ 
Thus, the IntfoSuctiOn of automated large-scale statistical pro^ 
cedures brings into question the wa N ^., we. design . observation*; 
gathering ventures and the way we deal with the results. This effect 
often 4s obscured by the amount of sheer computational power at 
the analyst's disposal. ? , 

• The approach to dat^ analysis taken i^this article is motivated 
by just this availability. As implied^'above, such a resource is helpful 

*i& extending the bourt^aries associated, with tboth the volume of 
data? to be analyzed arici the computational complexity of the 

f analysis: In the gasCcoinputation was, difficult, and^hus the (ievel- 
bpfnert^of statistical methodology frequently was associated with 
.numentja) tricks or shortcuts. There appeared statistical codk-, 
boolcs, comparable to the classic Joy of Cogking, that described 
aV#Hable computational formulae ("recipes") and example prob- 
letps'to which these formulae were suited. In current textbooks \ve 

> stifi fiijd cfiapjers cpn.such topics as linear regression, t tests, one- 
way knaly^is^fy^n (anova), one-way anova with a covariate, 
two-way linQ^wh^hoXi;? interaction, tvv^o-way anova with interac- 
tion, etU,^ith^accompanying descriptions of computational short- 
cults. The yiofume of material required to present thejse methods and ^ 

Stheir associated .fexamples.is considerable; As: a result there often is 
pnl^^^ limitations afnd restrictions for valid 

- iise of each r prpceciure. , ; , " L » 

^"■^^e|iVaHa'ibility of-reajly inexpensive compu^injg promises tb 
change all this; One can^erceive jwo distinct dihiensions of this; 
potential of computational complexity 

^ the applicability of a particular statistical 

technique in^a gi^eb context carf be tested Routinely and automati- 
cally. Secortd,^the^ design of an^experiment need not , be com- 

v grb considerations. This met$horpl<ibsis, 

iiowpyeir, >i£ "-Q^rrfng^ slowly;;- but iy is in progress and we can 
diSciiss itsj impiicatipris^ C that $ccuratfe r executipn 

• V of complex- computational algorithms is rou tine, the time has come 
tcr change our perspective of data analysis, from the coifop;utational 
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issues to those issues 'characterizing the data source itself. In this 
article we shall concentrate on various models of data sources and 
we shall investigate how automatic computation can aid ip testing 
hypothesized properties oXlhese models. We shall take as given the 
notion that these computational facilities significantly extend both 
the limits on the amount of data analyzed and the "permissible" 
> scope of the analyses to be applied. ^ 

J^ATA SOURCES 

Before investigating data models and computational methods, it 
will be helpful to review data sources as yieWed' from basic physics 
and biology. The physical and biological sciences have demonstra- 
ted (to my satisfaction) that properties and derived functions reflect 
internal structure. Properties are derived from the particular con- 
figuration of atoms and bonds between them. As a simple example, 
i acetone and propionic aldehyde have the same chemical compo- 
sition, G 3 H 6 0, but different structures (Figure 1): 

H 0 H ^ 
i it : i ■ — ■ ' 

C - C - C - H 

' ' ...... 

H H Acetone 

H m H 0 

i * i . ii ' - . ■ ■ . 

C - C - C. - H 

H H Propionic Aldehyde 

v Fig. 1. Two-dimensional representation of acetone and propionic aldehyde. 

For these two different structures, the following properties are ob- 
served: \ 

Acetone Propionic Aldehyde 

colorless liquid 
specific gravity = .72 
'melting point = —94.6 
boiling point = 56.5 
infinitely soluble in H 2 0 



H - 



H 



colorless liquid 
specific gravity = .81 
melting point = — 81 
boiling point = 49.5 
not infir^tety- soluble in H 2 0/ 
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Although the differences, are sm^ll, this example clearly, demon- 
strates the dependence of properties on internal structure. \ 

Heterogeneity of function is deperfdent on both the diversity of* 
available building blocks and the diversity of templates used to . 
define objects assembled from such- building blocks. Man-matfe 
structures seem to^njoy greater homogeneity oflunction than- do 
living structures. This is due in part to Jhe limited number of 
templates we use in synthesizing complex objects, and also to the 
structural uniformity of each building block. With carefully refined 
materials, we are able to make many uniform copies of complex 
objects by applying the same template to these materials. For in- 
stance, all copies of a particular pocket calculator or food mixer 
share a high degree of similarity. 

In contrast, living objects are replicated by applying a highly 
variable template to a set of primitive building blocks. The re- 
sulting mixture of enzymes, substrates, and membranes form living 
cells. Biological replication is based on building a new template for 
each new copy, half of which is derived from each parent template. 
Such a scheme results in considerable diversity of templates. As an 
example of the variety of available templates, consider ourselves! 
Humans have 46 components in their templates, 23 derived from: 
each parent. Thus there are at least 2 46 possible templates. The 
magnitude of this number can be contrasted with the current world 
population of 2 32 (or 4 billion). In addition to the diversity 
achieved from nonuniform templates, environmental surroundings 
contribute to diversity. E^ch new cell that is created evolves by 
responding not only to the control that is resident in its internal 
composition but also to the environment surrounding it. Thus, 
identical cells will evolve into different organisms wheh exposed to, 
different environments. The growth of our bodies from a single cell 
certainly exemplifies this. . />, 

Outcomes of experiments are . determined by both the nature of 
the stimulus and the nature of the substrate. Thus we need to be 
able to characterize accurately andrthen select like stimuli and like 
substrates. Because of deficiencies in measurement, however, our 
accuracy of characterization is limited. Our senWy inputs limit the 
degrees of characterization of cellular aggregates. We characterize 
structure and associated function of cellular aggregates by proper- 
ties that can be seen, felt, heard, tasted, and smelled. We quantify 
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these properties by applying yardsticks that are graduated in units 
of temperature, length, color, sound intensity/pitch, etc. Our indi- 
ces are limited in their precision. Therefore, we are only able to 
differentiate among cellular aggregates down to a certain size com- 
mensurate with our level of perception: Beyond this level, we 
cannot detect any . further differences without running the risk of 
classifying unlike substrates as like. 

As stated earlier, much of science is "done" by observing the 
interaction between substrates and stimuli. Presence of an interac- 
tion will support one line of reasoning while its Wtk will support its _ 
converse. It is important, therefore, to be able ^measure proper- 
ties accurately so that responses derived from different interactions 
can be discriminated. The errors 'associated with our inability to 
discriminate among outcomes, i.e., classifying unlikes as likes, form 
the justification for a statistical approach to data analysis. 
." To fix these ideas, consider a number of patients given one of 
two blopd-pressurerlowering drugs. Our task is to determine * 
whether the drugs are equivalent; hence, we use blood pressure as 
an indicator of th© drug-patient interaction. Measuring blood pres- 
sure for each patient, we find a spectrum of pressures. This spec-' 
trum arises from the fact that the patients aire structurally different 
from each other, and thus, for each patient, the drug has a*different 
substrate with which to interact. Both the substrate and the drug 
determine the observed pressures. Part of the spectrum of pressures 
is due to the heterogeneity among substrates. This portion is us- 
ually referred to as "error" or "noise." The rest of the spectrum of 
pressures is due to the drug-substrate interaction.' Our job is to 
determine, with some level o£ confidence, which of these two ccm-? .... 
ditions is immediately more significant i.e., whether the spectrum 
of pressures is predominantly due to substrate heterogeneity or to 
the drug. . 

It is important to remember that, while dealing with collections 
of data, three sources-of variation are operational: one comes from 
the interaction of an ideal stimulus with an ideal test unit (ideal 
patient in the example above); another arises from the difference ■■; 
between the ideal stimulus and the. real' one; while the third stems 
from the difference between the ideal and actual substrates. As long 
as we are unable t6 define likeness or we persist ii\ calling known 
unlikes as likes, we shall be faced with sources of variation (stimu- 
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ills and substrate) influencing what we measure. The goal -of sta- 
tistics is to aid in separating outcomes due to differences among 
substrates from outcomes elicited by the stimulus. Automatic com- 
putation can aid significantly in meeting this goal. 



HYPOTH^ES AND TESTING 

Suppose we take a sample of 100 patients who are representative 
of a particular population, give half of them a new antihypertensive 
drug, and give the other half a water pill. We measure the blood 
pressures of each group after one month of treatment and find the 
spectrum of results shown in Figure 2. Now we repeat the experi- 
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. ?. Fig. 2. Hypothetical results of a drug test on 100 patients**^ 
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ment on another group of 100 patients and find the results shown 
in Figure 3. Several observations can be. made.. At first sight, all the 
histograms are different. We expect this since we did not study like 
patients to begin' with. The second ^.observation is that the histo- 
gram of the treated patients seems. to show^ more patients below a 
critical pressure, p c , than the histogram of untreated patients., We 
would therefore conclude that the drug interaction with patients 
results in a lowering of blood pressure/ 

We are worried, though, .that there may be sufficient structural 
differences between patients in 'the two treated groups that our 
conclusion is in error (i.e., simply wrong). In ordertp estimate the 




No. of Patients | 
Wi th Response j 



I VI 



■J--: i 



(Treated) 



t 



I 



I, 



Pc 



pressure 



No. of Patients | 
With Response • j 




. (Water) 



'.. 350 -\ ' -'y^^ C ' C F. Starmer * r • - *\ : \ ". 

•-•V ' r -•" ^' • Vi •' ' ■ -V 5 V .■ 

differences. These models form part of the theoretical underpinning 
of /statistical data - analysis, ^nd their understanding leads' to an 
awareness of the limitations' of various procedures. 
To begin with, several * terms must be definecj^so that our 
: language parallels that of traditional statistics. A primitive "event" 
V is the outcome of an experiment or some activity, and the "sample 
spac£".S, is the set of primitive events representing all, possible 
outcomes of an experiment or activity. A compound event is one . 
.expressed as the union of primitive events. A "probability measure" 
then;-is a real-valued function defined on a sample space S such; 
. that:; - \ . ." ' . . ' ' 

: 0 = p(e) = 1 for every event e in the sample space; 
V; : ' p(S)-l; . ■ . • : '-c 

1 p(e t u tj • • ) = P(ei) + p(e 2 ) + • ' ' for every - ?i 

sequence of disjoint events e x , e 2 , . .. ; , 
two events, a and b, are; independent if ?j " 

; " v p(a rand b) = p(a)p(b). ? ■ \ 

The probability; p(e,), /of the event ej usually is perceived as the 
expected frequency^ with which event, e r , pecurs. A function whose/ 
value is a real ihumbef associated with each primitive event in the 
/sample space is called a randpm variable. "A discrete probability 
distribution is ;a- function associating each possible value that a 
random variable can take on with the corresponding probability. 

We iihali simplifyithe bipod pressure experiment inVotde&to 
demonstrate how probability, lised to assess the uncertainty associ-| 
V. ated with our calling unlikes as likes, is applied in the analysis of; 
^''^periiiheriial results/ ; Ihstead ; Of recording pressures after faking the 
£ medication, we simply record whether or not the pressure propped; 
>in a treated group of 40 patients! We find the following results: : v £ 

" V.' * . '., no change - . ..r-f % ■ '■ V " 

■A, ' ■»!".- : *. ■ '' dropped or increased^ ■ t r f ; J. ..' : -n* •• 

• ' ! \\ :\u ^ /:] ■ : y V 

" ' ■ >.. ! \ ■ • ^ 

\. - ■' ■ ?'V" «• ■ -■ i*. - . ; ; - • 

Thus, the observed probabilities of ajdwered pressure and an to 
>)>- changed or increased pressure ^was \l and .3, respectively. If tn| 
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study wefe repeated again and again; we might find observed prob-' 
abilities of .80 and .20 or- .60 and .40, etc. Can we develop a model 
that might indicate how, much i variation to e£j>ecf given a hypo- 
thesis or model describing source signals (blood pressure)? Our 
model might describe varioiis degrees of gxpected bloqd pressure 
^ change and we cbuld spend a long time ^ValUating.each model. We 
v could circumvent tHis^proloriged eyaiu^tiqtf;"by -assuming a model 
that states the drug did nothing and that^therefore any variation in 
(outcomes was due solely to patient differences. Sindejhere is only 
one hypothesis to evaluate, i.e., nc^4rug_eflfeGtV the labor spent in 
testjng models is reduced dramatically. This jnriodel (no change 
mo^el), of coursejis the famitiar 'Vulj hfpothesis.^fa 
stated it Is [?. ' v * -f : C . , n . • 

which states that the probability qf a change in. blood pressure, 
should bfc 1/2 if the 'drug. had na effect. On theSaverage, half the . 
patients ^should experienced idecrease while the^other half should 
experience no change o t r an increase: i ^ J>/.* 

~ Since We are^oingitb^deal with qqarititativie assessment of the 
hypothesis, we have to assign, a numerical yalue to the various^ 
outcome events. Let I be an indicator random Variable that has the 
value p. wherf thq blood pressure drops and v 1 when the blood 
pressure rises or regains, unchanged. Let us, define the expected 

"Value of a raridom variable, £(X)^Jf/& denote, byXj the value of X 
associated vfith the ith eyent in'ttie sample space. The expected 
v^|ue of X is • , >- . 

■'■'J ' «f(X) = IX iP (X i )- r. 

■ ;. "'! ' }' :: '• '. 
where, p{X|) is the probability associated with the. event represented 
> by. the yaHie Xj. We then define the expected variation of a randbin 
variable, X, as C ; :r: *' ~'\ \ : 

^ v . Var<X) = S(X - S{X)f = £ ( X i - ^) 2 P&iK ;y ° ^ 

•V. •': =/(x 2 ) - f(X). .' ' 

The expected^yalue of a sum of random variables, Yj , is 
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The variance of irsum of independent random^kriabjes is *. 
?y : ' • VaitlajYO pE af Var(Yjj. 

Let p be the probability when the indicator variable is 1. The 
expected value of our indicator function is then: 

■ /; ; ^i) = r- p + o - (i - p) = p . 

and its variance is *• ■ - 

/V. \ * var(i) = p-p 2 = p(i~p): 

Thus, for our experiment the null hypothesis would produce an 
expected value of p or 1/2 and a variance\)fi(l — 1/2) = 1/4 = .25. 
Since we are counting events, wfe define a new random variable, B, 
as the number of patietfts experiencing a blood pressure drop; so 

10 .* 

b= E i,. 

Therefore, <f(B) = <f(Z I) = np = 10(1/2) = 5 

..- ■ Var(B) = Var(E I) = np(l - p) = 10(1/4) = 2.5 

V; Var(B/n) =j(l/n 2 ) Var(B) = (l/n)p(l - p) = pq/n = .25/10 

; ; \ > = .025. , 

The standard deviation is defined as 

Sd^VVar, tT 

so ■ ° 

£ - . Sd = v/!025 = .16. 

The issue r before us now is^ whether ..the fraction of patient£ gxperi- 
? *fencing a pressure drop is consistent^* with observations 
^cfiaracterizing 10 patients from'a population of untreated patients 
(the population defined by the null hypothesis). For this we need to 
4 know the likelihood of getting results as extreme- as the .7 we 
observed, given a no effect model (an untreated population). This 
reduces to a problem in combinatorics: Assuming two equally 
likely outcomes, what are thcprobabilities associated with all pos- 
sible partitions of 10 outcomes (0 decrease, 10 no change or in^ 
crease; 1 decrease, 9 jao change or increase, . . .). Let i be the number 
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*•■ * .■ . . 

of patients experiencing a decrease in pressure; then the probability 
of finding i patients out of n, P(i), is 

The results are summarized in Table 1. 

ft ' . St 

Table 1 r . 
Expected pressure drops for patients assuming no dTug effect. 



Number of 
patients 
with, 

a reduced 
pressure - # 


■f % 

Number- 
of 

partitions 


J 

Probability 


: \ 


i = 0 


1 


1/1024 


as 

.000977 




1 


10 


10/1024 


.009765 




2 


45 


45/1024 


.043945 




3 


120. 


120/1024 


.117188 




4 


210 


210/1024 


\ .205078 




5 


252 


252/1024 


.246094 




6 . 


210 


210/1024 


.205078 




7 


120 


120/1024 


.117188 




.8. 


. 45 


45/1024 


.043945 1 




9 


10 


10/1024 


.009765 } 


.171875 


10 


1 


1/1024 


.000977 J 


* 



From this discrete probability distribution we see that the likeli- 
hood of finding 7 or more patients out of a group of 10 who 
experienced a drop in blood pressure when untreated was .171875. 

It is helpful to recognize that this observation is the result of 
dealing with a theoretical population of. heterogeneous or unlike 
patients. The model allows us to project the results \of experi- 
menting on the "null" or untreated population, and then to com- 
pare these results'with those obtained from an actual experiment. • 
v For\this particular example, we are left with 'a final decision :*T)id 
the drug do something or not? Therefore, we must draw a line 
somewhiere ^ , that observations falling on one side of the line 
suggest a drug effect 2nd observations falling jui the other sijgge^t 
none. In so doing we jefbw we shall make mistakes from time to 



354 



C. F. Stanner -: 



^''tiSE^^caXling, drug^.-cflTcctiyiB that, are hot (type .l err<fy).,ahd'C^ 
. effefctive} drugs iri#$^^ 
this, pro olem so lorig as we deal with samples fr^^ 




I] 



s present 



■v. •the; niUI hypothes^ m^^&erye 

at leaSi 8 patients Nwt^ 
£ therefore, we conclude (peirhaps : in error) tlj\at idr 
: " The r^xle played 6y • sartipli ng in the example c 

/ : eihj>ta^^ 

tients fifom a heteroj^'eouiSP group or population, an icj 
wtreate^fof' fault^interpretatjjbh^.of data; The simultanel 
orheterogengity :atTfcfng both patients arid the^tre^^ent aiwaj 
; /presents ai l jJiiellin^a. Wha f t is responsible for the obsertf®fc 
Eitiier jhe heterogerfeity or the treatment 
There are only t^o guaranteed wa$s v to; resolve thj s ; cd jji^^t ^ n^ is ^ 
w:to solve the" heterogeneity problem by; precise, matching r'^p^tient^ 

^ ------ — — 

utinfi 



Thisvis. no^t attainable with current t&hiMogy.. The 
: however, is attainab deal 
';■ / ? :bntirj2 ifpopulattipn, Efficienf Bata coilecfton^iigh speed to: 
"inexpensive mass^tbrag^^at^l highly ;^il[&tive. dat^o^f^ilShti^ 
'techniques have n>#de thjsl option a feds^le one, whe&a^l.O ye&fsf 
' #gp it fy^s ^ wprKj^n samplin^ad ^ppu| 

^ such an appror^ u 

• bf rthe question^ 

%• lec!i6it f techniques . wm tpty'ard iempvm 



V due to * >sampHhg. ;Th effect :df compu^i^ 

.^•'^'/•TJKe/punp life^been ^£-4^^ 



s^mV'jpf ffie^ 



Underlying concepts cqjptral to.tfie ^xp6rimen tel^|jre^ b$?In« per/ 




c ;CQjitrol set". A :nuirhypoth^ 

vk ftnip^ treatment effect Tfte null 

|§;hypbifiesis; : is 'Stated in : t^^sP^of - ait^i^tsrt^t ical ^" model ^ich *pfpvid?sv 
us jVith what we would, expejpt frtfm studyujg * "fl8£. treatment" , 
intervention. Such } Studies a^edhanced by exploiting cpmputing 
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relsources tg examine ob^ervedjdata iii their full' r to an 

extetit impossible heretofore. Thtis, for .example, generation and \ 

scrutiny, of histograms derived from T ia "fl^ treatment":- population 

thelps siet valuable guidelines for accepting or rejecting a hypothesis/,, • 

HISTOGRAMS AND f Heir; characterization r V' . " 

■ v, ' * '■: ,,. ■' '■" V! • . ; . 'ft r ; ■ ■' * :f :v V . ' / 

in) our experiments, we observe a property of the. experimental 
unit or substrate and a sequence; of n observations' (ti 9 1 ? , ., .^t n ) is^&Z 
Recorded* Under appropriate conditions the spectrum of observed^' 
values may be thought of as n samples of a single random Variable, 
x^, havhi£a given probability distribution. T^he variation ipbserved 
^uy Appling Xi arises from two sources : ; the heterogeneous nature 
of experimental substrates, and the interaction between-an in- 
tervention or treatment and tfw substrates.^ 9 r " 
• The essence bf hypothesis resting, as we 'saw earlier, lies in deter^ 
mining whether or not two or more random variables, display the 
safee behavior. This behavior is, monitored by sampling the 
random variable. A histogram; the cornerstone, of many : staiistical . 
procedures, is one . way of displacing the random variable's* 
behavior.'; ;.- .v ••• ' .;>■•...'*•-■ -\ 

Histograms are displays "of the; frequencies of values of ^e,'*. 
random , variable. Thus, for. oiir earlief example, -the hi^ogra'ijil^ 
would appear as shown in F^ure 4. We have-no difficulty in re- - 
membering the features of this particular histogram. However, had 
w.e chosen4b record intervals of changes in blood ^ pressured 



'. f-c,,*.: - .1 >^ • ; . ...... 

V-V b ' > : . • frequency . I . I \ 



i iv 



Cy'-tt.r' l.-'r • "iV.'v . v value .. . • .; 

Fjg« 4, Histogram gjpressure effects (0 = pressure drop4 
. Q . ^" . 1 .= ^priessure increase' or nQ^chsingS);^.. - ^ 
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of simply the direction of change, oun#histogr£im mrght appear as in 

Clearly^ as the number of observed values increases, the com- 
plexity of the histogram becomes, unmanageable. Thus, unless 
find a convenient characterization ^equatiqn)' that describes this 
frequency-pressure relationship T or have acqess to high-speed com- 
puting, we shall h&Ve difficulty In/compaphg this histogram with 
. ^thers. The amount of detaij, in 'a histogram increases with the; 
. Cardinality of the sample space* thereby /making manual treatment 
difficult. Therefore? histogram? have given rise to a number of "de- 
scriptive" statistics; used to rep.reseht certain features of the h^to- 
,vgraift Hypothesis testing, then, ^approached indirectly by com- 
' paririg thes^ descriptive statisti^fratead of comparing the histo-. 
gfams from which -they were'deriyed.. v v ._„;^ y ; 
w It is u&ful to speculate about approaches to histogram compari- 
son;, given' high-speed computing. 'Information is lost when .using 
descriptive statistics to characterize histograms. Means and vari- 
v tinces, the "most popular descriptive^terms,; characterize only the 
!|Tgrbss ; ^pe of the histogram. Techniques for direct comparison of, 
histog^^ lost through use ; ,0f these terms. 

AHbwe^r, -such* comparison required more ''Computation siich *as 
^orti hg^the data, and computing the cumulaUye frequency TutiGtiqh.* 
(The Kolmogprov-Sihirnov two-sample test ^Jif is an example 'of 
^ one such test, This test compares tl^e cumulative ^ 
f butipns arising .from different experimental settings.) Kor large* diata, 



# ; j^equency .J {<s 



V- ,1 - • ! I I I 11-^:.' — — 
I I I Till .1; I B \ ; 

" I I Y I I II II — I I Tl ;• -. : \, 

■si j<: 1 1 .1 1 1 i : 1 i ? : ;.hri . i -i 1 1 ; :,: 

v amount v offelobd pressure change 

Fig. 5. Jlistograrc^ * 



collections (in excess ot.50 data pdints)/thfi ipanual coi®putafic>na|; ■ ■ 
'eflcrft r eq in re c b tVdi& -y r — 

computing removes this constraint^/'':'.- . ? ^. ' c ' \ i>. 

* -The .median, mode; ^kcw, kurtosis, etai Mfcj^ other indices isspci- v > 
atcd ; ytftfo - various features ef /a • .h^tbg^^^^Miich of Jiypothesisi y 
■.testing is based- bh comparing; pn^br ; ^^p^f the:se features: IBe-^i • 
caijisc all the information c^pture^ 
ted in thes^ 

source: to the: dait^ making a, V . 

type 4- or type 2 mistake f^byei^ tfi^ feptu^da.l^\e intercst--;t v 
jng properties that produce s^etal^itorgivljri^ 

anQ all$ty us 4o estimate parameters embeddfcd r in r j?rpbabilityi/3(is-- " ' • 
■tributjoiis that are. aissoCiat^ \:: 

" Earlier^ 1 il^was stated that v an axperijti^rtl cajf be viewed as sam- 



pling a randpm varmble .\t certain cotidrtibhs are ftet. If sampling is 
suehtfiat v ^- - " " ' 




mate tr 2 (x) with s 2 where .^j* ^ f^g y , ^ • 
r > / A • :, " s 2> = I'l'^M ^Bl - ^ ' ' f . - 4 | 
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: v Again, ,s 2 ■ like X 1 can bef used reliably to. estimate pr 

random variable (andrthvisf a histogram) ind 
l^ibjiity distribution. r 1 '■■ ^■^ 

vjJJ^ To summarize, Aye assess the interactions between freiJp 
^^npuli and experiinehtal substratp^y, observing a property of the 
^IfosfVi^ is represented by a. random, variable that. 

yj^^^ value for. each diffcrenf experimental oiStccJmc. 

^Si^WfP^P 1 " 1 * 0 ^ ^ e random variable are monitored by cbltecting , 

: ^S^^atuating ^ random sample, t and thp fe&turtfs of the sample arc y 
;.' pbttr^d by a ' histogram, t^y^thesis ' toting is based on . com-- 
1 paring random variables suggested by, competing views of an e*r 
: '- : * periment This caparison is carried but \by"determinihg whether 
the competing random variables are equivalent Because histo- 
s\ grinis: have more features than. -we dan' coixifortablys handle, .they 
[ jtre' usually represented by means and variances. TKe nulI-hypptK-; 

'-Vcsis, -- therefore, is reduced' to equivalence* of means and variances 
K^Wd the hypothesis test usually is established by comparing these^^ 

'"means and variances. It ..is not' outl^^dishlQ chajactenze this ap- 
' • pr oach as being prompted by *h£ Necessity of ^ivii)gMn a -world 

* without computers. The anachronism is clear. ^ 

J 1 . - # ...TV. V- - v • 

; j" j ,;>;.: : '/ti : : - '•' 1 ""' •■- ■■•^ 

,Whpn estimating propertied of a random variable, from v a random 
• ; ; sample, it; is helprfuT'tp knoW ^hether or not the estim 
. |An unfrased'VstimatQr, one for^ whieh its f^pp(^\^^^ 
/ equal to the parameter* Mhat^it^stimates. : , > :% y -yC "0 

P thfe^sampje^mean estiiffiatfe the; expected vkl ue 6f 4?r^dom-va^ 
^ i^Ble in ; an !unbjai^ of * 2 'if.;, 

biased; rnst^of' ^ ?i?rsy,\}*} . « 

• the expected value for s* is % > >• v t ' '■■y 
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husj for an unbiased estimator of a 2 we need to have . , .. 

This necessity," of course, is well established and it is accommodated 
routinely in computational procedures. 



STANDARDIZED VAlUABpST' 

It ;is intefastji^tcyiote that histograms resulting from sums of 
experim^fyaljibserva^ almost the same when expressed 

n\ tKfe appropriate units. Consider the sum of the points of several 
i dicej An experiment ^n this context) consists of a number of throws 
, of n\Hce> After each ihrow, the points on the faces are summed and 
a J^i^ of each sum is tabulated. Plotting the 

-^t^jVi^requencies o/eaclr sum gives the results of Figure 6. As the 



^ : v.pw \ i K i 4 p.i \ ... 

^ \ > v. r L* I I H i 

• v : . >. : l . . p : i l y I -l ! v ; 

' ■ v v />" ■ 4 



I 



- pi h v -v. - [ #f^^ 



l-'v< 



r 



V • -i al . 



r. 



^ t^" • '•\'.;V '•--2-'" : - '"4V ; '^5-" 8 10^ 12 '14 16 18 20 ' 
i V^r V' r 1 ^- 6. f Histograms ^ siim^if points A fo v r n thrgwftr djce, P, . 
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number of terms in the sum increases, the histograms get wider and 
■ the peak moves to the tight/ ^ je the sum of n dice. Then its 
expected value is • 



1 



« • ••:.!•■ '! £ 6 -/7 7 1 
<?(S.) = E i P. = 7 -S-i.-rg— 2 - 



The variance jasspciated with S t is 4 , ' 

•vi;f;J: : ? '■' • • Var^)- ^(Sf) ~ cf 2 (SJ 



Mm 



Sincejthe n dice are independent, the variance of S n is * .* 

I'Yt- *■ - • ■*■'■«•. v- -- ■ 3 5hj ** 

ft". ? V ? i^r(S„) = VartS, + S, + ••.• + &-/rz* 

and the standard deviation, then'. is 1 ••• : . 



tff- "^J^n •••• 



From these relationships, we^see that?p^jh^an i l^P^rianc^9^ 
the S n is dependent on the number of tefms in Xb$ su ^fife-.v^^ 

^ For putposes of histogram comparison, it is helpfti " 
grams are normalized such that the peak amplitude and variance/; 
are independent of n., Two parameters, need adjusting: the mean 

• and the variaric^; The mean can be shifted to th^rigin by sub- 
tracting from each, sum the expected value pf t^stjm, while': thp 
width ofthehi^ 

(sum -i- <?(siim)^^ the^standard deviation ofMe sum;fTHus we 



. V COMPU^ 

form ai$c\v random variable, z, where r * v ' 

■ ^ , v Sd(Sj. ..■ •• / 

The propeftifes of z can be established by computing its expected 
Value and its variance: ' .. v , \ ""^yV 

^ \m >( s n)-^ s 3) ' 4 : 

■•. ■.. . ... Sd(Sij) 4 ... ... ...... 

, • " •• Var(z) = Var S " ~ <?(S " ) = 1 >r " 

■■ • .*" ■ ■ _ ^ - •.: - .... ■ 

This normalisation works' for any random variable, not just sums 
. of random variables. Therefore,, afl rattdom'^ariable^of the.form 1 

: \. .'"."-if. ' x.-4p&> • 

•• ••• ,>.• .v. ... : ^ v . Sd(x) _ ■ ■ . 1 

\Will^hayQ expected., value = 0 rand variance or standard; devi- • , 
ation "1; Without extensive high-spee3 computatfonal facilities; A 
aheseradjustments oft^ to : 

preclude their Use. That deterrent is gone." " . * 

■ \ ^tien dealing specifically ^vy|th sums orrandom variables, their 
histograms approach the form normal curve as rt increases.. 
This is ; generally .observed regardless of the underlying probability 
ijis^tribu the behavior of a random v 

:yi^ variance a 2 , is defined (Figure 7) by : , 




v - N(x ; n, a 2 ) = 


r i (x-//) 2 i 




- ; • 2 - .a 2 J 



c£ Sell 
orem: 



; the .observation that' sums of randqjm variables prbduej 
^hapgd histograms 4s ^ tStbd jgx p 1 ic i 1 1 y in thexentral limit theo 

Let x, , i as 'A, 2, , n, be independently distributed rajidom vari 
ables with the sanid probability distribution having ah expected 
value ^ (x). and a finite variance a 2 : Then 
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M(x) 




. ^ - ,x— ■ 

Fio. 7. Normal distribution. y 

is asymptotically normally distributed with mean of 0 and a van- y 
ance of 1,N(0, 1). ' ' . ' ' . ; J* 

The importance of thinheorem lies in theTact that it provides a, 
very robust (i.e., distribution-free) test for comparing histograms 
when one corisiders only ^the centroid of the histogram as being ' 
important. The mean, being a sum of random variables, will be 
asymptotically normally distributed. A couple of examples here will 
■.help demonstrate this point 

Example 1. We have two vdrugs we wish to evaluate: one is a 
placebo ^and Tth^othef is^an antihypertensive drug for lowering 
blood preS!^ whether the pressure 

obtainetf'af^ than it was prior to 

administering the dru^^^esii^ displayed in Figure 8. N A 
patient^wefe~stad^ while N B were studied-witkdrug- 



8 Unchanged 
Lowered or Raised 



i 



Drug A >^.vl I I NA *■ a + 

L . a | . b I 



Drug- B vl.* | I NB - c + d , 

. ■ — -i'Sr.-.*- N m a + b + c -fc^^ 

>; : ■ : • ;,-%«v. x /fJ, " 7 ;;.C • : - ; . • • he* 

h . 7. ^i.d. 8. Resutfs of a drug comparison experiment. . • \ 





pe^p :; pressure in the ith ^ 

f y patfent^ U , r^jjpnse to drug 

wA = response to ting will be a function 

^ofihe two average resi>on&£ r and /i B /Vtei estiipate ju A from 



and'a B from 
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The hypothesis \yc test is that — = 0, From the definition 
of the i test, we set v%; "- : -, ; - : ; " r ' : ; 



z (X A -'X s r 
" Sd(X A - x„v ' . 



The variance 6f(X A - X„) = Yar(X A ) + (Var(X„).'Sincc the null hy- 
pothesis is assumed true, we have • 

l>. Var(X A ) + Var(X„) = Var(X)Q- + , > 

N A " + N n _ 2 ' . ; ^ 1 

The z statistic, derived in this manner, is equivalent to the t statistic 
used for comparing'the means of two populations. It is interesting 
to note that the critical value ofz for rejecting, the null hypotn&sis^ 
at the .05 level of significance is ± 1,96.. This is the asymptotic value: 
t approaches a§ the sample sizes increase. v 

From these examples one can sec how to design a test for many 
hypotheses wh^n a mean or other linear function of the: data rep- 
resents the information content of a histogram.- As a first approxi- 
mation' for developing hypothesis tests, the z statistic is not a bad 
choice. One is required to specify the underlying data model so that 
expectcd'means and variances can be evaluated. Critical Valuesj for 
•accepting eft rejecting the hypothesis are then derived froiji tables-' 
of N(0, 1). The increased analytjfal facilities provided by the pro- 
cess of standardizing variables certainly are beyond, dispute. How- 
ever, they no longer are something Jto be wished for. Rather, they 
arc something to be expected; ; ^ ! ^ 

;Wc'haye neglected to di|gftss notions of power/sample size esti- 
mation, and type 2 errors: .topics relate directly to the obser- 
vation Ithat the; variance of a mean decreases^ as n, the number of 
? obseryations^ncreases,' Many useful exercises exploring Ap^^conr 
cepts cat^^ asymptotic apprp^ch used 

above! '■ -'''M : "' v ^ . " i, . ' ' ■ - . vs-JK' * ■ '. < 
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RANDOMIZATION triST^ 

Although z tests cun be used in many common experimental 
settings; there arc maiiy otjicr tests that do not rcqyirC the invoca- 
tion of asymptotic properties. RandomUutiori tests, bftscd. on enu- 
meration of all possible partitions of ^e.quteomtfs, constitute sitch 
a class of tests, In particular, easy access to computing casts a 
different light on the; practicality of these tests which were devel- 
oped by Fisher [2], These tests assume that the observed samples 
are from a "null" population. By ignoring the original partition (by 
treatment or intervention) of the observations, and investigating all 
possible partitions o r f the data, it is possible to o&cribe a "local" 
probability Jdistribution/attributed to a'.Specific "null" data gener- 
ator. The original observed data partition is then compared to this 
distribution and'an assessment made as to whether the data are 
consistent with the "null" data. These tests were little uspd because 
of the computational load ^required ? to permute the data. High- 
speed computing,, however/ makes a certain class of "these/: testi 
feasible,' thereby placing thqir power* and versatility well within 
reach/ ■ ' - r- ;.; r ! 

Consider the case oftvvo drugs again. A study oftwo groups of 5 
patients might appear as shown in Figure 9. We assume that-;the i 
null hypothesis. is true, i.e., Drug A responses are no different from 
Drug B responses. OFurthermorc^wd assumei tHat the proportion of 
( — ) responses to ( + ) responses reflects the underlying heterogen- 
eity of the experimental substrates (patients) and that the number 
of patients subjected to each drug was predetermined. The "null" 
probability distribution, then, is derived by enumerating, all data 

K . y (f t ' • • " •'■ 

'<* .' : t 1 ■ . ' ' ' 

<v . " Response 9 Total 
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configurations pdssiblc fq^fixcd marginal* totals (5 and 5 in this 

Case). 1 " ' \wr-ttt?--''^ • ;■■ \ ! ' A 

• . The number of, ways of partitioning 10 patients uMq two groups 
of 5 .is, ft 0 ) '.m 252, each grouping bcjjjg equally likely. The number , 
of ways of partitioning the (+) lrcspi>nscs into two groups, one of 
size a 1 is' The number of ways of partitioning thc (-) responses 
into two grodps, one of size b is ({j). Thus; the number of ways to 
get a responses 'afttyta^ 5 ^j* responses is 



iSince' ther^arc (^j 0 ) i)ossiblc # partiti,ons of tl\c data, the P ro W»^ 
of getf^'^^J'Tesponses.iaiid a ("+') responses where b « ^f^R^t 

(4V 6 ' " 



, , ;a'A5 - a\ 
Pta)* 



(?) * 



For a general table as shojvn in Figure 10 . (where a ^is the smallest 
.integer among a, b,c, and dj, we have V 




Ya+.cVbH- d\ 
I V a A b ) 
. /a + b+V+dY 



tabulatiojsfof the probability distribution then requires adjusting 
a, b, c, jimd d such that a*h b = n A , c + d = n B , a + c = l lf and 



Total. 



Drug A |'« ' . T I nA 

I a I b |«. . 



Drug B | ' I J I nB 

I / c I d' I 

.f ^ — — • . r 

Total 4 11 / • 12^1 

Fio. 10. Generalization of the t&o^drdg test, 
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b -f d = 1 2 ,« while a ranges from 0 to min^, n^). Thus,^the % disyi- 
butioh . associated with a 2 x r 2 *table; having marginal totals or5- 
i ah(l .6, 4 is* glvep in Table 2. /The probability of observing A or 
fewer ( + ) responses with drug A is .023&\+ .23810 = ' .26 1 9o/ fThe 
results of our experiment^ therefore, are consistent with taking two, 
samples of 5 from the same population, i.e., no drug effect. / x _ - v 

Evaluating the probability, distribution can be simplified con- 
siderably b'y : expanding p(a) in terms of factorials. This leads to an 
efficient algorithm for assessing the probability of a single table. 
.The; additional elements of the probability distribution^can theq be, 
depVed iteratively by multiplying and dividing this/quantity by 
functions -of a, b, c, d,*Sn^ a variable x which is incremented or 
decremented By one for each iteration Q3]. A|?iin,^thi^ is discour- 
aging Tor computationaT resources Hijiitied toimanual means, but 
eminently feasible for integration as part of a abrnputer-based ana- 
lytical environment., a/ '.v.iV')". 
.?As a second example we wish to determine^ whether, diet A is 
different from diet B. We have two groups or patients with n^ pf 
them on diet A, and n^ on diet B. The weighLchange for each 
patient is measured based on the Weights before and after difet. 

The hull hypothesis for this particular/study is that/diet A is 
equivalent to. diet B.-vas reflecfed by the weight changes. If the diets 
are/equivalent, then the n A + n B patients could be considered to 
have come from the same population. ^hus, the association of li^ 
weight changes with diet A and n B weight changes with diet B is 
totally arbitrary. Following the last^ample^ we have (; n ** nB ) pos-' 
sible partitions for "labeling" patients with diet A or diet B, 

For each partition we compute Jthe average weight changes for 



Ta'ble 2 



•0/ 



Probability distnUution of a 2 x 2 table 

with marginal totals of 5, 5 and 6 r 4. 

/ / 




/ P(m) 


p(i^m) 


m = 0 . 


' :02381 


. .02381 




.'23810 


.26190 




-747619 


.73809 




.23809 


.97619 


■4 


v. :02381 J - 


l.oo: - 
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patients labeled diet iVand diet B,f We order ,the differences of 
average weight changes and constru^ ;a>histogfam (Figure 11), 
Frond this histogram, we determine whether the original; weight 
difference associated with those^ patients treated with diets A and B* 
is consistent with ;the*nrull probability distribution.' . / 

Actually the complete his^gra^ need not be. constructed. We 
only need the tails of the hisloJ^m^Tfte portion required^ depen- 
dent on the amount of type 1 error .we are willing to tolerate. Thus, 
if we set the type 1 error rate^to 5%,: then Wff only need compute 
the .05 ( ftA n A ni ) most extreme data configurktioife^ . W ^ .( 

By sorting the data theseJ^on figuration can easily be found. Let 
dj be the ith observed weight change and r, the ith rank ordered 
weight change. f ' ' * 



natural order: 
rank brder: 



d l d 2 ••• dn.d ni;i d„ AtJ •yk+n. 



r A = 



• • • 



■ r n A (n A + J , *n A V,' 



+ n. 



r B = 



£i-n A Vr 



Tfie jnosf extreme data configuration leads Jo a weight change , 
difference of r A — r B . Less extreme configurations can Jbe computed 
by interchanging r„ A with rj, AM , etc, In thisJjpanjier the^ytremes of 
the histogram can be directly computed. This is a powerful yet • 
simple algorithnywhofce routine use is made, feasible by high-sf>eed 
computers and techniques developed for ^them. 3 



number of I 
partitions I 

■■•■-/: ' i 



; i 

i i 

r i 

i ■■ i 

i i 



M 
I 

I .1 

i : r 



mean difference in weight change 



Fig. liryistogram^or diet experiment. 



• . 'tW J^Ajrftiogproy-Sntirnov test represents another vehicle, for; 
comparing two random^ varialjlcs.* The test is based on comparing 
the observed cumulative* frequency distributions associated with the 
two randorix'variablcs in question. As such, Xhe' test is sensitive to 
anything (Via t ^modifies the distribution function, including^changes 
•in means: and Variances. If two sanjples have been drawn from the 
same population; th?n one would expect the sample based cumu- 
lative distrittutions' to be' fairly close to each other, since the oftly 
,diffQrences yroulci arise from baselinrf heterogeneities. When the dis- 
tance between the two cumulative dj3tributians is top. great at any 
particular ppiritt then the. possibility of "differences between the two 
random' variables mli$i b& considered. \ . ( ■ . 

The Kolmogorov-Smirndy test is based on the largest djfference 
between the cumulative distriputions associated with two random 
variables. Let C(\{} be ihe pbserved cumulative frequency for the 
ith value of random variable x and let C(y { ) tie the observed cumu- 
lative frequency for the ith value of randorti variable y. ^et n x be 
the tota) number pf observations of X and n y b0 the total number 
" of y. Then let" ."^ 1 \ 



4 



■ ■ *v • ' . Dj = max 



(C{x { ) C(y\)\ 



be the* maximum "difference in' tl\e 'observed cumulative ^distri- 
butions and r , V 



D 2 V = max 



be the absolute maximum difference Jj?et>yeen; the two observed 
cumulative {jisfobutions. - > 

; This test is "not widely used tegause of the^effbrt required to 
tabulate the 'Cumulative frequeHcjr^Jtribution ifoC large collections* 
of data. However^ high-speed qompiitjng minimizes the difficulty in' 
sorting the data and hence estimating the cumiJjative distribution, 
' Like the randomization tests r tliis ^t makes tlB assumptions about 
underlying' probability .distributions, and therefore provides an 
unencumbered>view of^ experimentally derived da^a. " The op- 
portunity to reniiove such encumbrances constitutes perhaps the 
most prifoutidVffect of computer science on statistical work. 
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t 

MODELS OF DATA SOURCES 



Thus far we have looked at simple data source's where the experi- 
mental substrates, though heterogeneous^ were considered equival- 
ent. When we cah ; characterize the heterogeneity, th^n thqj^ exists a 
class of models that pari be used to improve our ability to test 
hypotheses. Let x (l xm) be" a XI x m) vector of attribute valiies that 
describes the properties of a single experimental substrate. Since, 
most experiments compare two or more interventions, treatments, 
or stimuli; we include these in piir definition "of properties. The 
response we observe as an outcome of an experiment, th$n, is some 
function of these properties, i.e., V , . • , 

To a -first approximation, the response outcome y c&n be rep\; 
resented' by a linear function of the substrate ppoperties ^nii inter- 
vention ■. . >• ; C . [ '; 

' '. y = xA ' : "- 

where A is an (m x 1)' vector pf model parameters. We rarely are 
able to de^l with all properties contributing to the heterogeneity r of 
thd* experiment; so we include a- noise source, 6. The observed 
outcome, y, then is expressed as > • . * 

■ . y : <, v y =,xA + e. if 

For an experiment involving n "independent substrates, we can' rep- 
* resent the data in matrix form as ; * 



y = X A + e 

(nxl) (nxm) '(mxl) (n x 1) 



£ where 




r 



One method of estimating the, parameters of the models A, is to 
choose the value of .A that ipinfmizes the squared difference be- 



I . i M • 



twcen the prpdjeted responses, and the observed responses, V, or 

"■■ . - J'- ■• • ;min ^ Y)'(t - Yy • 4?^.^^v,;^:^X^ 

( . ; : A .. . ■ ,-.*>■ ■ V • 

1 fc min (Y- XA)'(y -XA)/ VV': ' ' 
; ' . ;.. V./v-:./ \". V'""' '.* 

Thuyviyq can estimate A by standard technique: > 

(ipxl) (mxm) (m><n) (n.H 1) * 

The expected value of A is 

' <?(A) = <y[(X'X)- I X'Y] o (X'Xr'X'^V) ; 
& . - - 1 = (X'X) " 'X'XA m A ' • ! ;.- ; - ••'••„,'/'/ 

. and its variance is • • ^ '- 

I Var<A) = Var[(X'X)-; l X"?] = (X'Xr'X'XfX'X)- 1 Var(t)' 
uk";fei ^(X'X)" 1 Var(t)/* I -1 



For experifnents where the variance is independent of the'slibstrate 
used,^e can let > r ' * : ■ /' ^ v . 

Var(V) = a 7 [ ■ • t 

J. ■ *v 

Note that the estimate of A/the modet parameter vector, is based 
on a ^linear combination of the outcome_jor response variables. . 
Thus, each modfel parameter .should have an asymptotic normal 
distribution and we can *test hypotheses using z statistics. J - 

"•• '* +'i ' j ' ", '•' ■ 

HYPOTHESIS TESTING » / > Jk 

t ;■'•/• . • - 

• , , . \ ■ ' ■ i 

J ' • . \ ' 

We frequently test ideas about the aj's Nvijfh hypotheses of the 
form i { — 0 6r a,"= aj, i & j; A general class.bf thesfe hypotheses can 
be described by a matrix product of.thefdrm 1 _ i 4 



.; c a U.ib ,^ • 

(sxm)(mx 1) j v (ixl) . \ 



» 1 
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I is an m 
mental model is 



where C is an iirbitrary matrix of rank s i m/ Suppose .the experi- 



.-7 < ' 



and the null hypothesis H 0 : ai » a 2 . Then, , . ';,.<♦ 

To test such v a genera) class ,'of hypotheses, we need to know what 
• contribution^ the fit'betwecn the data and 

the model. This can be;as^^cd by performing a constrained opti- 
mization of . r , * ^ '! •< »: 

' V' ; ' l-i >',■■. V XA)'(y. - XA) ; V 

■ l' ' y; ;';"l $ : i ' ' • ■ ' i V* " 0 •:- /B / ■ 
From this^e fin^Vhe mmimum to be j V 

where A = (X'X)"" 1 X'Y, th^nresYricted estimate of the parameter 
vector. TJhe term . ' 4 ° *' 

' ./ -; # (Y<-XA)'(Y-XA) 

is the squared error tiiie to the • unrestricted model while 
^'CTCfX'X)"^']^ 1 CA is the additional error due to° the re- 
strictions. ' ' 

.•Let . .. . ... ,, .. • ... 

• s„ ^A'C'cqjcxr^ci^cA 4 

,and ' \- .-. ■ * ^ 

s E = (Y X$)'(Y - XA) == Y'Y — Y'XA. ■ V 

Then, a ratio of average errors can be formed for testing hypo- 
theses. Thus, to test the hypothesis GA,= 0,n % 

■ ■: ' . % r Js H /rank (C) ' > / - 

where F*is the variance ratio with r#nk (C) and n - m degrees of 
freedom. Clearly, high values of F would suggest^the null hypoth- 
esis does not hold while low values of F (<4) are consistent with 
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the null hypothesis,. Note that we have 'made* no^ser^us distri- 
bution assumptions since intimates of parameters, A, are linear 
combinations of identically distributed random variables. 



IMPUMENTAflON 



' : Access to computing Considerably simplifies the management of 
/ this general linear .model. We shall look at the computational, rc- 
' qiiirements necessary for having such a general model at o\}f 

fingertips/ Expanding Jlhc geheral model V — XA, we obtain 




22 „ «V 




wher^each row represents data associated with a single substrate, 
taranteters of ^e model are estimated from * 



Th^r 



A .-JX'XJ^X'Y. 

':• '« I . ^ Uri^l) <m*m)<mM) • 

Expanding X'X r we find*! 

% i PsEa* i. X *l Z"-1 X 11 X I2 
(mx>., t 

r 

md expanding X'Yand Y'Y, we find 





v » sr. 1 y? 



X'Y 

(m x 1 ) 



* These two matrices can be accumulated incrementally . Starting 
with X'X = CTand X'Y ~ 0, each term, of each sum is computed 
based on an ith substrates characterization £y i9 ^ft, x i2 , ... , XjJ. 
Each element fof, the matricesvis then updated. By exploiting these 
data-structuring techniques, <jne can analyze mddels* independent 
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of n, the numbcf of substratps studied without retiring additional 
computer memory for the datq. The necessary computations for 
hypothesis testing are simple functions of these thrcjci matricps. 

To minimize the programming necessary to support the general 
linear model computations, it is hclpfijl to have processing modules 
for performing the following tasks: 

1. inversion of aninV x m) matrix A » B" 1 , , 

2. pre/post multiplication of a matrixjwith another matrix > 

, , , ' . C « A' Bf A 

* (sxi) ' (iKm)(mxm)(m><s) 

3. matrix multiplication C « B A . 

. i^K;, , ("»><•) (mxpXpXi) \ 

The following sequence* outlines .the computations necessary to 
evaluate a jjnear model: * 

* I Initialize Y'Y, Y'X, XX to zero, 
i. Accumulate Y'Y, Y'X, X'X for n data vectors. # 

3. Estimatbrwiodel parameter from A .» (X'xW'X'Y , 

1 , (mx 1) (mxm)/ Jmx 1) 

4. For each hypothesis, get contrast matrix C and form 

S '= 1 A' C <C (X'X)- j C C A v 

(I x m )( m x •)[_(• x m)(m x m)(m x g)J (i x m )(m x 1) 

. ; (note: two applications of pre/post multiply) 
S= Y'Y - Y'X A ^ 

(|x i) (i xmKmx 1) ! ' 

_ r _ S H /rank (C) 

S^n - m)* • > h 

The resulting significance level for. F(rank(C), n - m) can be looked 
up in tables or computed directly. Here again, standard computer 
science techniques make these implementations relatively straight- 
forward. * 



EXAMPLES 

Consider the drug problem-. We have two groups of patients of 
size n A and ri B . Group A is treated with drug A;*gr° u P B is treated 
with drug B. The null hypothesis states that drug A is equivalent to 



376" af. m StarmL% s f • 

• ■ » ■ i 

drug H. We measure the blood pressure to assess the drug-patient 

'interaction, One model fctr this problem is . • > - • /r 7 



where the independent variables .x n and x, 2 arc 0/1 variable, aj Ls 
\ the model pressure .assocKtfcd with drug A, a 2 is the model pressure 
. associated with drug B t and x ( is the blood pressure response of the 

ith patient, ^ct • * , 



\ 



\L x,,^ 1 if patient is givfcn drug A, 

V else 0; 

\ 

\ x, 2 =s 1 if patient is given drug B,' 

clseO. * 



\ 



The data^om an experiment then will appear as 




To test the eqqivalcncj^of the two drugs as reflected by the average 
responses, let 

, The F(l, n A + n 0 - 2) 'is equivalent to the square of the t* statistic 
> obtained from a standard Student's t test. Next, lfct us assume that 
the blood pressure response to the <4rug h dependent on the orig- 
inal blood pressure. (People Wh normal pressures will have a 
small response while people with high pressures will have a larger 
response.) Define a new model, \ 

* V 

y { - x n *i v + x l2 a 2 -V, x l3 a 3 + x l4 a 4 
. * \ . j 

wheje x u ^nd x l2 are 0/1 variables depending on the/drug uspd, x i3 

is the initial blood pressure of patients treated with drug Affnd x i4 
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is the initial blood pressure of t^iticnts treated with drug D. Graph- 
ically, this assumed behavior would appear as in Figure 12. The X 
matrix now appears as * . ■ 



/ 



r 


0 


Pi 


i 


0 


Pj 


i 


0 




l 


0 




l 


0 




l 


0 


PnA 


0 


1 


0 


0 


1 




0 


1 




0 














1 


'0 



0 



\ 



0 



/ 



TherS are several ways of stating null hypotheses for this study, 
the two lines are equiv 

(\ -1 0 0\ 
\0 0 1 -lj 



One way is to state that the two lines are equivalent to 

\ ( 1 -1 

0 



where row one tests equivalence of intercepts while row two tests 
.equivalence of the slopes. 



Observ ed 
Pressure 




Initial Pressure 



Fio. 1£. Treatment experiment with blood pressure response sensitive, to 
* initial blood pressure. 
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As a final example consider simultaneous drugs, A and D, wjieriff 

we are ;tcs(ing three different doses of drug A and two different 

° doses of drug B, A schematic yicw would appear as 



Al A2 A3 



r • 

01 1 

i 


1 
1 

1 1 


. 1 


i 
i 

3 1, 


■ 1 


i 

02 | 

1 1 


1 
1 

4 1 


i 
i 

5 1 


1 
1 

« 1 


t 



Thus, patients %rc trcatcd'with various combination doses of drugs 
A and D. An appropriate model here, would be 

y,P x u ai + x, 2 aa + x, 3 a 3 + x u a 4 + x l3 a 5 +'X| 6 a 6 . 

where; a t is the model parameter for a particular dose combination, 

Thus, a t is the parameter for AjBj, a 6 is the parameter for A 3 B^; 

x tJ is 1 for treatment combination j and 0 for other treatment 

combinations, Graphically we can visualize this experiment Us 

: shown in Figure 13. There arc three hypotheses of primary interest: 
* 

1. Is there any interaction between drugj\ and drug B? 

2. If there is no interaction, is there a dose effect with drug A 7 

3. If there is no interaction, is there a dose effect with drug B?' . 

the interaction hypothesis, tests whether the presence of one dfUg 
affects the response of another drug In a variable manner. Thus, ^e 



i 

I 
I 




Fia 13. Simultaneous dosage experiment. 
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^# , 

arc interested in wncthcr tho Dodo A profile is parallel to the Dose 
B profile, The test for parallelism is ^ , 

' " A,tti - A,Bj «• A,B, - A 3 B, 
AjDi - A,Bj - A,B, - A,Bj. 
Therefore, . ' t ■ . 

' Yl 0 -1 -1 0- 1 

' « , Qm \fi 1 -1 "0 -1 1 

' The test of a dose effect with drug A is derived by summing over 
the B doses such that 

dose l effect m \ { B V + AjD 2 

dose 2 effect m A a Bj + A 3 B a 

dose 3 effect ■* A 3 Bj + A 3 B 2 . 

The dose effect is assessed by testing 

dose I effect — dose 3 effect ==0 

dose 2 effect — dose 3^ effect « 0 

or , ' - " s ■ 

a\ c.fl o -4 1; o -i\ 

. ■ \^ \o i v -i o i — iy - 

Similarly, for the dose effect with drug B, 

_ CM1 I 1 1-1 -1 -1). 

These examples demonstrate the simplicity of hypothesis testing 
when one starts with a data 'source model. We have not been 
rigorous in the dcvclopmcht of the probability distributions associr 
atcd with testing these models. However, these tests arc quite 
robu$t, and «i practice with moderate-sized samples, the central 
limit theorem saves the day [4]. 
; , * '"" - - 

OTHER APPLICATION? OF. UNBAR MODELS , ; / f 

• Linear models have found very wide use m the analysis of experi- 
mental data. The fact that a Taylor expansion can be used to 
linearize a nonlinear mbdcl has been particularly usefuf in dealing 
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with biological data.. As stated^arlier, the assumption'of an experi- 
mental error, n e, that is independent of the dependent or response 
Variable*, . is necessary for useful pacameter estimation. However, 
there are a number of settings where the experimental error is not 
independent of the response variable. ; • j . ' " * 

One such case arises when dealing with frequency or counting* 
data. -Data obtained from radiation detectors has an unde^fying 
I^oisson distribution. For this distribution, the variance, c 2 , is equal 
to the mean, ''ji Thus, if one acquires a count of 10,006 over a 
one-minute fjeriod, the Variance is 10,000 and the standard, devi- 
ation is 100 counts/min/ However, if one acquires 1,000 counts over 
the same period, the standard deviation is approximately £2 

-. counts/min. Clearly, tfte Igwer the count, the greater the percentage 
error. The percentage error for the 10,000 counts is only ±1% 
while the percentage errbr for the 1,000 counts is ±3.2%. • 
Another case anses from the use of an algebraic transformation 

.to linearize a nonlinear model. For instance, the gamma function is 
used frequently to characterize indicator dilution data. Let C(t) be 
the concentration of an indicator at time t. The 4ime dependent 
behavior of C is expressed as 

■ V C(t) = Kt*e"^. V • / 

where k, a, and "fi are model parameters. By taking the natural 
logarithm of both sides of Jhis epilation we obtain . - t 

. InC(t) = K + aIrit-4 ^ V 

which is a linear model with independent variables 1, In t, & and t, 
. and model parameters k, a, and 1/jS.v 

Taking logarithms, however, modifies the error term, e. Initially, 
it is assumed that e is independent of C(t). The variance introduced 
by the transformation is proportional to 1/C 2 (t) 5 thereby violating^ 
the constanfvariance assumption. Y ~~ ." 

Estimation of model parameters from either of these two exam- 
ple^by minimizing 

\ (Y - XA)'(Y - XA)" 

results \in unreasonable estimates. Large deviations between, the 
data and the predicted data will be weighted more than small 
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deviations- To adjust for this effect, a weighted minimization of the 
foriti •,. , "v.;' • -'■ 7 



(Y - XA)'W(Y -XA) 



is used where 



w 



1 



"(nx n) 



0 



0 



Each individual weight, w>J is an estimate of the noise level or 
variance associated with the ithjiata point. Thus, a large 'deviation 
between the model agd the data , 

will be scaled appropriately. Parameters are estimated in the stan- 
dard way. Thus, 

r; V m^(Y - XA)'W(Y - XA) ■ 

yields . ...^--••-••--'^"^r"""'^^'' 

V-' ; ; .' . ^ = (X'WX) "^X' WY. ' • £ .'. 

Hypotheses of the form CA = 0 are tested by forming the statistic 
.•>7: „_ S H /rank(C) \ [ ' V 

k'.--^. ■ ;• ;•■ •" S^n-m) : ; >; ,V 

where • . ■'<■'■ ):■,■'■■ 

S H = A'C'tCfX'WXr/CT^A . 

' ■■and;? ;' . ■ , ■ ■ \- .:■ 

^ > S E = Y'WY - Y'WXA. 

- Observe that the differences in A, S H and S E , between the weighted 
model and the unweighted model are restricted to X'WX and 
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X'WY. Thus, if X ; WX and X'WY are accumulated instead of X'X 
and X'Y, the computer program for supporting the unweighted 
model becomes a program for^supporting a weighted'model. . 

The weighted modeh has* found considerable application when 
dealing with categorical or discrete valued observations. Grizzle et. 
al. [5] showed that many problems in categorical data analysis 
were simply weighted regression problems. The Grizzle approach 
has led to a unification of categorical data analysis procedures^ ust . 
as the unweighted linear model has unified the analysis. of continu- 
ous data. These are major strides toward in^proved effectiven6ss of 
statistical analyses, and their widespread introduction is linked in-* 
exorably with the computerization of the associated algorithms. 

discussion , '■' i 

Data analysis is a context-sensitive activity. For the first time, 
with Jiigh-speed computing and inexpensive bulk storage, we are 
able Realistically to match each experiment with an appropriate 
model Therefore, in addition to carrying out an experimental pro- 
tocol, the investigator can also develop a model relating experi- 
mental outcomes with the properties of each substrate. Manual 
computing resources limit the complexity of the investigator's 
model; worse, it can forc^ .inappropriate oversimplifications that 
obscure important effects.' High-speed computing; ho_wever, re- 
moves this constraint and allows the investigator to account for 
more properties Contributing to substrate heterogeneity. A model 
that accurately describes substrate properties improves our ability 
to separate-treatment effects from substrate effects. w 

High-speed computing facilities and techniques also provide new 
avenues for data analysis. Histograms, depicting the behavior of 
random variables, can be readily created and displayed, providing 
the investigator with a rich, graphical representation. Since histo- 
gram generation is no longer difficult, comparison of random vari- 
ables can be carried out-by-visual inspection of their histograms. 
Procedures for histogram comparison (Kplmogorov-Smirnov) can 
now be routinely performed for large data sets, since the sorting 
required to generate the histogram is only a minor issue, treatable 
as a single conceptual operation with which the analyst need not be 
burdened. 
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: High-speed computing also -provides new life for ora methods; 
Randomization of permutation te'sts, first suggested by Fisher, re- 
quires the computation pf*a prpb&bility distribution/ for e^ch data 
set that is derived from an experimental investigation. The prob- 
ability distribution is derived by first assuming the null hypothesis 
true, and then computing a measure of the treatment effect qp'^ach [ 
permutation of V the experimental data; This poses little difficulty 
when only a few observations are m the data set. However, manual 
methods bog down for, say, 20 or more' observations. High-speed 
computing provides a practical n\eans for preparing each permu- 
tation. These randomization tests seem ideally suited for toclayY 
computers and provide a class of "exact" tests that are free of 
assumptions about underlying probability distributions. 

Computing* opens new options for data analysis. These options 
suggest avenues for statistical research which will significantly aid 
the scientific investigator. Similarly, statisticians and their col- 
leagues are taking on problems that develop larger and larger 
amounts of data. The resulting data-management activities provide 
new directions, for research in computing hardware and software. 
Computing and statistics are irreversibly bound together. The evol- 
ution of computing hardware can Ao more ignore the area of data 
analysis than can the evolution of statistical methodology ignore 
'the tools that computer science provides. 
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